-
- Weitere Informationen zu diesem Buch:
Inhaltsverzeichnis | Index | Probekapitel | Rezensionen |
Leseprobe in dt. Sprache (PDF) |
- Weitere Informationen zu diesem Buch:
JETZT ONLINE BESTELLEN
Leading Programmers Explain How They Think
First Edition Juli 2007
ISBN 978-0-596-51004-6
618 Seiten
EUR37.00
Weitere Informationen zu diesem Buch
Inhaltsverzeichnis |
Index |
Probekapitel |
Rezensionen |
Leseprobe in dt. Sprache (PDF) |
Inhaltsverzeichnis
- Chapter 1: A Regular Expression Matcher
- InhaltsvorschauBrian KernighanRegular expressions are notations for describing patterns of text and, in effect, make up a special-purpose language for pattern matching. Although there are myriad variants, all share the idea that most characters in a pattern match literal occurrences of themselves, but some metacharacters have special meaning, such as * to indicate some kind of repetition or […] to mean any one character from the set within the brackets.In practice, most searches in programs such as text editors are for literal words, so the regular expressions are often literal strings like
print, which will matchprintforsprintorprinter paperanywhere. In so-called wildcards used to specify filenames in Unix and Windows, a * matches any number of characters, so the pattern*.cmatches all filenames that end in.c.There are many, many variants of regular expressions, even in contexts where one would expect them to be the same. Jeffrey Friedl's Mastering Regular Expressions (O'Reilly) is an exhaustive study of the topic.Stephen Kleene invented regular expressions in the mid-1950s as a notation for finite automata; in fact, they are equivalent to finite automata in what they represent. They first appeared in a program setting in Ken Thompson's version of the QED text editor in the mid-1960s. In 1967, Thompson applied for a patent on a mechanism for rapid text matching based on regular expressions. The patent was granted in 1971, one of the very first software patents [U.S. Patent 3,568,156, Text Matching Algorithm, March 2, 1971].Regular expressions moved from QED to the Unix editor ed, and then to the quintessential Unix tool grep, which Thompson created by performing radical surgery on ed. These widely used programs helped regular expressions become familiar throughout the early Unix community.Thompson's original matcher was very fast because it combined two independent ideas. One was to generate machine instructions on the fly during matching so that it ran at machine speed rather than by interpretation. The other was to carry forward all possible matches at each stage, so it did not have to backtrack to look for alternative potential matches. In later text editors that Thompson wrote, such asEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Practice of Programming
- InhaltsvorschauIn 1998, Rob Pike and I were writing The Practice of Programming (Addison-Wesley). The last chapter of the book, "Notation," collected a number of examples where good notation led to better programs and better programming. This included the use of simple data specifications (
printf, for instance), and the generation of code from tables.Because of our Unix backgrounds and nearly 30 years of experience with tools based on regular expression notation, we naturally wanted to include a discussion of regular expressions, and it seemed mandatory to include an implementation as well. Given our emphasis on tools, it also seemed best to focus on the class of regular expressions found in grep—rather than, say, those from shell wildcards—since we could also then talk about the design of grep itself.The problem was that any existing regular expression package was far too big. The local grep was over 500 lines long (about 10 book pages) and encrusted with barnacles. Open source regular expression packages tended to be huge—roughly the size of the entire book—because they were engineered for generality, flexibility, and speed; none were remotely suitable for pedagogy.I suggested to Rob that we find the smallest regular expression package that would illustrate the basic ideas while still recognizing a useful and nontrivial class of patterns. Ideally, the code would fit on a single page.Rob disappeared into his office. As I remember it now, he emerged in no more than an hour or two with the 30 lines of C code that subsequently appeared in of The Practice of Programming. That code implements a regular expression matcher that handles the following constructs.CharacterMeaningcMatches any literal character c.. (period)Matches any single character.^Matches the beginning of the input string.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Implementation
- InhaltsvorschauIn The Practice of Programming, the regular expression matcher is part of a standalone program that mimics grep, but the regular expression code is completely separable from its surroundings. The main program is not interesting here; like many Unix tools, it reads either its standard input or a sequence of files, and prints those lines that contain a match of the regular expression.This is the matching code:
/* match: search for regexp anywhere in text */ int match(char *regexp, char *text) { if (regexp[0] == '^') return matchhere(regexp+1, text); do { /* must look even if string is empty */ if (matchhere(regexp, text)) return 1; } while (*text++ != '\0'); return 0; } /* matchhere: search for regexp at beginning of text */ int matchhere(char *regexp, char *text) { if (regexp[0] == '\0') return 1; if (regexp[1] == '*') return matchstar(regexp[0], regexp+2, text); if (regexp[0] == '$' && regexp[1] == '\0') return *text == '\0'; if (*text!='\0' && (regexp[0]=='.' || regexp[0]==*text)) returnmatchhere(regexp+1, text+1); return 0; } /* matchstar: search for c*regexp at beginning of text */ int matchstar(int c, char *regexp, char *text) { do { /* a * matches zero or more instances */ if (matchhere(regexp, text)) return 1; } while (*text != '\0' && (*text++ == c || c == '.')); return 0; }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Discussion
- InhaltsvorschauThe function
match(regexp, text) tests whether there is an occurrence of the regular expression anywhere within the text; it returns 1 if a match is found and 0 if not. If there is more than one match, it finds the leftmost and shortest.The basic operation ofmatchis straightforward. If the first character of the regular expression is ^ (an anchored match), any possible match must occur at the beginning of the string. That is, if the regular expression is^xyz, it matchesxyzonly ifxyzoccurs at the beginning of the text, not somewhere in the middle. This is tested by matching the rest of the regular expression against the text starting at the beginning and nowhere else. Otherwise, the regular expression might match anywhere within the string. This is tested by matching the pattern against each character position of the text in turn. If there are multiple matches, only the first (leftmost) one will be identified. That is, if the regular expression isxyz, it will match the first occurrence ofxyzregardless of where it occurs.Notice that advancing over the input string is done with ado-whileloop, a comparatively unusual construct in C programs. The occurrence of ado-whileinstead of awhileshould always raise a question: why isn't the loop termination condition being tested at the beginning of the loop, before it's too late, rather than at the end after something has been done? But the test is correct here: since the * operator permits zero-length matches, we first have to check whether a null match is possible.The bulk of the work is done in the functionmatchhere(regexp, text), which tests whether the regular expression matches the text that begins right here. The functionmatchhereoperates by attempting to match the first character of the regular expression with the first character of the text. If the match fails, there can be no match at this text position andmatchhereEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Alternatives
- InhaltsvorschauThis is a very elegant and well-written piece of code, but it's not perfect. What might we do differently? I might rearrange
matchhereto deal with $ before *. Although it makes no difference here, it feels a bit more natural, and a good rule is to do easy cases before difficult ones.In general, however, the order of tests is critical. For instance, in this test frommatchstar:} while (*text != '\0' && (*text++ == c || c == '.'));
we must advance over one more character of the text string no matter what, so the increment intext++must always be performed.This code is careful about termination conditions. Generally, the success of a match is determined by whether the regular expression runs out at the same time as the text does. If they do run out together, that indicates a match; if one runs out before the other, there is no match. This is perhaps most obvious in a line like:if (regexp[0] == '$' && regexp[1] == '\0') return *text == '\0';
but subtle termination conditions show up in other cases as well.The version ofmatchstarthat implements leftmost longest matching begins by identifying a maximal sequence of occurrences of the input characterc. Then it usesmatchhereto try to extend the match to the rest of the pattern and the rest of the text. Each failure reduces the number ofcs by one and tries again, including the case of zero occurrences:/* matchstar: leftmost longest search for c*regexp */ int matchstar(int c, char *regexp, char *text) { char *t; for (t = text; *t != '\0' && (*t == c || c == '.'); t++) ; do { /* * matches zero or more */ if (matchhere(regexp, t)) return 1; } while (t-- > text); return 0; }Consider the regular expression (.*), which matches arbitrary text within parentheses. Given the target text:for (t = text; *t != '\0' && (*t == c || c == '.'); t++)
a longest match from the beginning will identify the entire parenthesized expression, while a shortest match will stop at the first right parenthesis. (Of course a longest match beginning from the second left parenthesis will extend to the end of the text.)Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Building on It
- InhaltsvorschauThe purpose of The Practice of Programming was to teach good programming. At the time the book was written, Rob and I were still at Bell Labs, so we did not have firsthand experience of how the book would be best used in a classroom. It has been gratifying to discover that some of the material does work well in classes. I have used this code since 2000 as a vehicle for teaching important points about programming.First, it shows how recursion is useful and leads to clean code in a novel setting; it's not yet another version of Quicksort (or factorial!), nor is it some kind of tree walk.It's also a good example for performance experiments. Its performance is not very different from the system versions of grep, which shows that the recursive technique is not too costly and that it's not worth trying to tune the code.On the other hand, it is also a fine illustration of the importance of a good algorithm. If a pattern includes several .* sequences, the straightforward implementation requires a lot of backtracking, and, in some cases, will run very slowly indeed.The standard Unix grep has the same backtracking properties. For example, the command:
grep 'a.*a.*a.*a.a'
takes about 20 seconds to process a 4 MB text file on a typical machine.An implementation based on converting a nondeterministic finite automaton to a deterministic automaton, as in egrep, will have much better performance on hard cases; it can process the same pattern and the same input in less than one-tenth of a second, and running time in general is independent of the pattern.Extensions to the regular expression class can form the basis of a variety of assignments. For example:-
Add other metacharacters, such as + for one or more occurrences of the previous character, or ? for zero or one matches. Add some way to quote metacharacters, such as \$ to stand for a literal occurrence of $.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Conclusion
- InhaltsvorschauI was amazed by how compact and elegant this code was when Rob Pike first wrote it—it was much smaller and more powerful than I had thought possible. In hindsight, one can see a number of reasons why the code is so small.First, the features are well chosen to be the most useful and to give the most insight into implementation, without any frills. For example, the implementation of the anchored patterns ^ and $ requires only three or four lines, but it shows how to deal with special cases cleanly before handling the general cases uniformly. The closure operation * must be present because it is a fundamental notion in regular expressions and provides the only way to handle patterns of unspecified lengths. But it would add no insight to also provide + and ?, so those are left as exercises.Second, recursion is a win. This fundamental programming technique almost always leads to smaller, cleaner, and more elegant code than the equivalent written with explicit loops, and that is the case here. The idea of peeling off one matching character from the front of the regular expression and from the text, then recursing for the rest, echoes the recursive structure of the traditional factorial or string length examples, but in a much more interesting and useful setting.Third, this code really uses the underlying language to good effect. Pointers can be misused, of course, but here they are used to create compact expressions that naturally express the extracting of individual characters and advancing to the next character. Array indexing or substrings can achieve the same effect, but in this code, pointers do a better job, especially when coupled with C idioms for autoincrement and implicit conversion of truth values.I don't know of another piece of code that does so much in so few lines while providing such a rich source of insight and further ideas.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 2: Subversion's Delta Editor: Interface As Ontology
- InhaltsvorschauKarl FogelExamples of beautiful code tend to be local solutions to well-bounded, easily comprehensible problems, such as Duff's Device (http://en.wikipedia.org/wiki/Duff's_device) or rsync's rolling checksum algorithm (http://en.wikipedia.org/wiki/Rsync#Algorithm). This is not because small, simple solutions are the only beautiful kind, but because appreciating complex code requires more context than can be given on the back of a napkin.Here, with the luxury of several pages to work in, I'd like to talk about a larger sort of beauty—not necessarily the kind that would strike a passing reader immediately, but the kind that programmers who work with the code on a regular basis would come to appreciate as they accumulate experience with the problem domain. My example is not an algorithm, but an interface: the programming interface used by the open source version control system Subversion (http://subversion.tigris.org) to express the difference between two directory trees, which is also the interface used to transform one tree into the other. In Subversion, its formal name is the C type
svn_delta_editor_t, but it is known colloquially as the delta editor.Subversion's delta editor demonstrates the properties that programmers look for in good design. It breaks down the problem along boundaries so natural that anyone designing a new feature for Subversion can easily tell when to call each function, and for what purpose. It presents the programmer with uncontrived opportunities to maximize efficiency (such as by eliminating unnecessary data transfers over the network) and allows for easy integration of auxiliary tasks (such as progress reporting). Perhaps most important, the design has proved very resilient during enhancements and updates.And as if to confirm suspicions about the origins of good design, the delta editor was created by a single person over the course of a few hours (although that person was very familiar with the problem and the code base).Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Version Control and Tree Transformation
- InhaltsvorschauVery early in the Subversion project, the team realized we had a general task that would be performed over and over: that of minimally expressing the difference between two similar (usually related) directory trees. As a version control system, one of Subversion's goals is to track revisions to directory structures as well as individual file contents. In fact, Subversion's server-side repository is fundamentally designed around directory versioning. A repository is simply a series of snapshots of a directory tree as that tree transforms over time. For each changeset committed to the repository, a new tree is created, differing from the preceding tree exactly where the changes are located and nowhere else. The unchanged portions of the new tree share storage with the preceding tree, and so on back into time. Each successive version of the tree is labeled with a monotonically increasing integer; this unique identifier is called a revision number.Think of the repository as an array of revision numbers, stretching off into infinity. By convention, revision 0 is always an empty directory. In , revision 1 has a tree hanging off it (typically the initial import of content into the repository), and no other revisions have been committed yet. The boxes represent nodes in this virtual filesystem: each node is either a directory (labeled DIR in the upper-right corner) or a file (labeled FILE).What happens when we modify tuna? First, we make a new file node, containing the latest text. The new node is not connected to anything yet. As shows, it's just hanging out there in space, with no name.Next, we create a new revision of its parent directory. As shows, the subgraph is still not connected to the revision array.
Figure 2-1: Conceptual view of revision numbersEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Expressing Tree Differences
- InhaltsvorschauThe most common action in Subversion is to transmit changes between the two sides: from the repository to the working copy when doing an update to receive others' changes, and from the working copy to the repository when committing one's own changes. Expressing the difference between two trees is also key to many other common operations—e.g., diffing, switching to a branch, merging changes from one branch to another, and so on.Clearly it would be silly to have two different interfaces, one for server → client and another for client → server. The underlying task is the same in both cases. A tree difference is a tree difference, regardless of which direction it's traveling over the network or what its consumer intends to do with it. But finding a natural way to express tree differences proved surprisingly challenging. Complicating matters further, Subversion supports multiple network protocols and multiple backend storage mechanisms; we needed an interface that would look the same across all of those.Our initial attempts to come up with an interface ranged from unsatisfying to downright awkward. I won't describe them all here, but what they had in common was that they tended to leave open questions for which there were no persuasive answers.For example, many of the solutions involved transmitting the changed paths as strings, either as full paths or path components. Well, what order should the paths be transmitted in? Depth first? Breadth first? Random order? Alphabetical? Should there be different commands for directories than for files? Most importantly, how would each individual command expressing a difference know that it was part of a larger operation grouping all the changes into a unified set? In Subversion, the concept of the overall tree operation is quite user-visible, and if the programmatic interface didn't intrinsically match that concept, we'd surely need to write lots of brittle glue code to compensate.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Delta Editor Interface
- InhaltsvorschauFollowing is a mildly abridged version of the delta editor interface. I've left out the parts that deal with copying and renaming, the parts related to Subversion properties (properties are versioned metadata, and are not important here), and parts that handle some other Subversion-specific bookkeeping. However, you can always see the latest version of the delta editor by visiting http://svn.collab.net/repos/svn/trunk/subversion/include/svn_delta.h. This chapter is based on r21731 (that is, revision 21731) at http://svn.collab.net/viewvc/svn/trunk/subversion/include/svn_delta.h?revision=21731.To understand the interface, even in its abridged form, you'll need to know some Subversion jargon:
- pools
-
The
poolarguments are memory pools—allocation buffers that allow a large number of objects to be freed simultaneously. svn_error_t-
The return type
svn_error_tsimply means that the function returns a pointer to a Subversion error object; a successful call returns a null pointer. - text delta
-
A text delta is the difference between two different versions of a file; you can apply a text delta as a patch to one version of the file to produce the other version. In Subversion, the "text" of a file is considered binary data—it doesn't matter whether the file is plain text, audio data, an image, or something else. Text deltas are expressed as streams of fixed-sized windows, each window containing a chunk of binary diff data. This way, peak memory usage is proportional to the size of a single window, rather than to the total size of the patch (which might be quite large in the case of, say, an image file).
- window handler
-
This is the function prototype for applying one window of text-delta data to a target file.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - But Is It Art?
- InhaltsvorschauI cannot claim that the beauty of this interface was immediately obvious to me. I'm not sure it was obvious to Jim either; he was probably just trying to get Ben and me out of his house. But he'd been pondering the problem for a long time, too, and he followed his instincts about how tree structures behave.The first thing that strikes one about the delta editor is that it chooses constraint: even though there is no philosophical requirement that tree edits be done in depth-first order (or indeed in any order at all), the interface enforces depth-firstness anyway, by means of the baton relationships. This makes the interface's usage and behavior more predictable.The second thing is that an entire edit operation unobtrusively carries its context with it, again by means of the batons. A file baton can contain a pointer to its parent directory baton, a directory baton can contain a pointer to its parent directory baton (with a null parent for the root of the edit), and everyone can contain a pointer to the global edit baton. Although an individual baton may be a disposable object—for example, when a file is closed, its baton is destroyed—any given baton allows access to the global edit context, which may contain, for example, the revision number the client side is being updated to. Thus, batons are overloaded: they provide scope (i.e., lifetime, because a baton only lasts as long as the pool in which it is allocated) to portions of the edit, but they also carry global context.The third important feature is that the interface provides clear boundaries between the various suboperations involved in expressing a tree change. For example, opening a file merely indicates that something changed in that file between the two trees, but doesn't give details; calling
apply_textdeltagives the details, but you don't have to callapply_ textdeltaif you don't want to. Similarly, opening a directory indicates that something changed in or under that directory, but if you don't need to say any more than that, you can just close the directory and move on. These boundaries are a consequence of the interface's dedication toEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Abstraction As a Spectator Sport
- InhaltsvorschauOur next indication of the delta editor's flexibility came when we needed to do two or more distinct things in the same tree edit. One of the earliest such situations was the need to handle cancellations. When the user interrupted an update, a signal handler trapped the request and set a flag; then at various points during the operation, we checked the flag and exited cleanly if it was set. It turned out that in most cases, the safest place to exit the operation was simply the next entry or exit boundary of an editor function call. This was trivially true for operations that performed no I/O on the client side (such as change summarizations and diffs), but it was also true of many operations that did touch files. After all, most of the work in an update is simply writing out the data, and even if the user interrupts the overall update, it usually still makes sense to either finish writing or cleanly cancel whatever file was in progress when the interrupt was detected.But where to implement the flag checks? We could hardcode them into the delta editor, the one returned (by reference) from
Get_Update_Editor(). But that's obviously a poor choice: the delta editor is a library function that might be called from code that wants a totally different style of cancellation checking, or none at all.A slightly better solution would be to pass a cancellation-checking callback function and associated baton toGet_Update_Editor(). The returned editor would periodically invoke the callback on the baton and, depending on the return value, either continue as normal or return early (if the callback is null, it is never invoked). But that arrangement isn't ideal either. Checking cancellation is really a parasitic goal: you might want to do it when updating, or you might not, but in any case it has no effect on the way the update process itself works. Ideally, the two shouldn't be tangled together in the code, especially as we had concluded that, for the most part, operations didn't need fine-grained control over cancellation checking, anyway—the editor call boundaries would do just fine.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusions
- InhaltsvorschauThe real strength of this API, and, I suspect, of any good API, is that it guides one's thinking. All operations involving tree modification in Subversion are now shaped roughly the same way. This not only reduces the amount of time newcomers must spend learning existing code, it gives new code a clear model to follow, and developers have taken the hint. For example, the svnsync feature, which mirrors one repository's activity directly into another repository—and was added to Subversion in 2006, six years after the advent of the delta editor—uses the delta editor interface to transmit the activity. The developer of that feature was not only spared the need to design a change-transmission mechanism, he was spared the need to even consider whether he needed to design a change-transmission mechanism. And others who now hack on the new code find it feels mostly familiar the first time they see it.These are significant benefits. Not only does having the right API reduce learning time, it also relieves the development community of the need to have certain debates: design discussions that would have spawned long and contentious mailing list threads simply do not come up. That may not be quite the same thing as pure technical or aesthetic beauty, but in a project with many participants and a constant turnover rate, it's a beauty you can use.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 3: The Most Beautiful Code I Never Wrote
- InhaltsvorschauJon BentleyI once heard a master programmer praised with the phrase, "He adds function by deleting code." Antoine de Saint-Exupéry, the French writer and aviator, expressed this sentiment more generally when he said, "A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away." In software, the most beautiful code, the most beautiful functions, and the most beautiful programs are sometimes not there at all.It is, of course, difficult to talk about things that aren't there. This chapter attempts this daunting task by presenting a novel analysis of the runtime of the classic Quicksort program. The first section sets the stage by reviewing Quicksort from a personal perspective. The subsequent section is the meat of this chapter. We'll start by adding one counter to the program, then manipulate the code to make it smaller and smaller and yet more and more powerful until just a few lines of code completely capture its average runtime. The third section summarizes the techniques, and presents a particularly succinct analysis of the cost of binary search trees. The final two sections draw insights from the chapter to help you write more elegant programs.When Greg Wilson first described the idea of this book, I asked myself what was the most beautiful code I had ever written. After this delicious question rolled around my brain for the better part of a day, I realized that the answer was easy: Quicksort. Unfortunately, the one question has three different answers, depending on precisely how it is phrased.I wrote my thesis on divide-and-conquer algorithms, and found that C.A.R. Hoare's Quicksort ("Quicksort," Computer Journal 5) is undeniably the granddaddy of them all. It is a beautiful algorithm for a fundamental problem that can be implemented in elegant code. I loved the algorithm, but I always tiptoed around its innermost loop. I once spent two days debugging a complex program that was based on that loop, and for years I carefully copied that code whenever I needed to perform a similar task. It solved my problems, but I didn'tEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Most Beautiful Code I Ever Wrote
- InhaltsvorschauWhen Greg Wilson first described the idea of this book, I asked myself what was the most beautiful code I had ever written. After this delicious question rolled around my brain for the better part of a day, I realized that the answer was easy: Quicksort. Unfortunately, the one question has three different answers, depending on precisely how it is phrased.I wrote my thesis on divide-and-conquer algorithms, and found that C.A.R. Hoare's Quicksort ("Quicksort," Computer Journal 5) is undeniably the granddaddy of them all. It is a beautiful algorithm for a fundamental problem that can be implemented in elegant code. I loved the algorithm, but I always tiptoed around its innermost loop. I once spent two days debugging a complex program that was based on that loop, and for years I carefully copied that code whenever I needed to perform a similar task. It solved my problems, but I didn't really understand it.I eventually learned an elegant partitioning scheme from Nico Lomuto, and was finally able to write a Quicksort that I could understand and even prove correct. William Strunk Jr.'s observation that "vigorous writing is concise" applies to code as well as to English, so I followed his admonition to "omit needless words" (The Elements of Style). I finally reduced approximately 40 lines of code to an even dozen. So if the question is, "What is the most beautiful small piece of code that you've ever written?" my answer is the Quicksort from my book Programming Pearls, Second Edition (Addison-Wesley). This Quicksort function, implemented in C, is shown in . We'll further study and refine this example in the next section.Example 3-1. Quicksort function
voidquicksort(int l, int u) { int i, m; if (l >= u) return; swap(l, randint(l, u)); m = l; for (i = l+1; i <= u; i++) if (x[i] < x[l]) swap(++m, i); swap(l, m); quicksort(l, m-1); quicksort(m+1, u); }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - More and More with Less and Less
- InhaltsvorschauQuicksort is an elegant algorithm that lends itself to subtle analysis. Around 1980, I had a wonderful discussion with Tony Hoare about the history of his algorithm. He told me that when he first developed Quicksort, he thought it was too simple to publish, and only wrote his classic "Quicksort" paper after he was able to analyze its expected runtime.It is easy to see that in the worst case, Quicksort might take about n2 time to sort an array of n elements. In the best case, it chooses the median value as a partitioning element, and therefore sorts an array in about n lg n comparisons. So, how many comparisons does it use on the average for a random array of n distinct values?Hoare's analysis of this question is beautiful, but unfortunately over the mathematical heads of many programmers. When I taught Quicksort to undergraduates, I was frustrated that many just didn't "get" the proof, even after sincere effort. We'll now attack that problem experimentally. We'll start with Hoare's program, and eventually end up with an analysis close to his.Our task is to modify of the randomizing Quicksort code to analyze the average number of comparisons used to sort an array of distinct inputs. We will also attempt to gain maximum insight with minimal code, runtime, and space.To determine the average number of comparisons, we first augment the program to count them. To do this, we increment the variable
compsbefore the comparison in the inner loop ().Example 3-2. Quicksort inner loop instrumented to count comparisonsfor (i = l+1; i <= u; i++) { comps++; if (x[i] < x[l]) swap(++m, i); }If we run the program for one value of n, we'll see how many comparisons that particular run takes. If we repeat that for many runs over many values of n, and analyze the results statistically, we'll observe that, on average, Quicksort takes about 1.4Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Perspective
- Inhaltsvorschausummarizes the programs used to analyze Quicksort throughout this chapter.
Table 3-2: Evolution of Quicksort comparison counting Example NumberLines of codeType of answerNumber of answerRuntimeSpace213Sample1nlg nN313""""48""nlgn58""""66""""76Exact"3NN86"NN2N96""""106""""114""N"124ExactNN1Each individual step in the evolution of our code was pretty straightforward; the transition from the sample in to the exact answer in is probably the most subtle. Along the way, as the code became faster and more useful, it also shrank in size. In the middle of the 19th century, Robert Browning observed that "less is more," and this table helps to quantify one instance of that minimalist philosophy.We have seen three fundamentally different types of programs. and are working Quicksorts, instrumented to count comparisons as they sort a real array. through implement a simple model of Quicksort: they mimic one run of the algorithm, without actually doing the work of sorting. through implement a more sophisticated model: they compute the true average number of comparisons without ever tracing any particular run.The techniques used to achieve each program are summarized as follows:-
, , : Fundamental change of problem definition.
-
, , : Slight change of function definition.
-
: New data structure to implement dynamic programming.
These techniques are typical. We can often simplify a program by asking, "What problem do we really need to solve?" or, "Is there a better function to solve that problem?"When I presented this analysis to undergraduates, the program finally shrank to zero lines of code and disappeared in a puff of mathematical smoke. We can reinterpret as the following recurrence relation:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- What Is Writing?
- InhaltsvorschauIn a weak sense, I "wrote" through of the program. I wrote them first in scribbled notes, then on a chalkboard in front of undergraduates, and eventually in this chapter. I derived the programs systematically, I have spent considerable time analyzing them, and I believe that they are correct. Apart from the spreadsheet implementation of , though, I have never run any of the examples as a computer program.In almost two decades at Bell Labs, I learned from many teachers (and especially from Brian Kernighan, whose chapter on the teaching of programming appears as of this book) that "writing" a program to be displayed in public involves much more than typing symbols. One implements the program in code, runs it first on a few test cases, then builds thorough scaffolding, drivers, and a library of cases to beat on it systematically. Ideally, one mechanically includes the compiled source code into the text without human intervention. I wrote (and all the code in Programming Pearls) in that strong sense.As a point of honor, I wanted to keep my title honest by never implementing through . Almost four decades of computer programming have left me with deep respect for the difficulty of the craft (well, more precisely, abject fear of bugs). I compromised by implementing in a spreadsheet, and I tossed in an additional column that gave the closed-form solution. Imagine my delight (and relief) when the two matched exactly! And so I offer the world these beautiful unwritten programs, with some confidence in their correctness, yet painfully aware of the possibility of undiscovered error. I hope that the deep beauty I find in them will be unmarred by superficial blemishes.In my discomfort at presenting these unwritten programs, I take consolation from the insight of Alan Perlis, who said, "Is it possible that software is not like anything else, that it is meant to be discarded: that the whole point is to see it as a soap bubble?"Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Conclusion
- InhaltsvorschauBeauty has many sources. This chapter has concentrated on the beauty conferred by simplicity, elegance, and concision. The following aphorisms all express this overarching theme:
-
Strive to add function by deleting code.
-
A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away. (Saint-Exupéry)
-
In software, the most beautiful code, the most beautiful functions, and the most beautiful programs are sometimes not there at all.
-
Vigorous writing is concise. Omit needless words. (Strunk and White)
-
The cheapest, fastest, and most reliable components of a computer system are those that aren't there. (Bell)
-
Endeavor to do more and more with less and less.
-
If I had more time, I would have written you a shorter letter. (Pascal)
-
The Inventor's Paradox: The more ambitious plan may have more chance of success. (Pólya)
-
Simplicity does not precede complexity, but follows it. (Perlis)
-
Less is more. (Browning)
-
Make everything as simple as possible, but no simpler. (Einstein)
-
Software should sometimes be seen as a soap bubble. (Perlis)
-
Seek beauty through simplicity.
Here endeth the lesson. Go thou and do likewise.For those who desire more concrete hints, here are some ideas grouped into three main categories.- Analysis of programs
-
One way to gain insight into the behavior of a program is to instrument it and then run it on representative data, as in . Often, though, we are less concerned with the program as a whole than with individual aspects. In this case, for instance, we considered only the number of comparisons that Quicksort uses on the average and ignored many other aspects. Sedgewick ("The analysis of Quicksort programs,"
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Acknowledgments
- InhaltsvorschauI am grateful for the insightful comments of Dan Bentley, Brian Kernighan, Andy Oram, and David Weiss.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 4: Finding Things
- InhaltsvorschauTim BrayComputers can compute, but that's not what people use them for, mostly. Mostly, computers store and retrieve information. Retrieve implies find, and in the time since the advent of the Web, search has become a dominant application for people using computers.As data volumes continue to grow—both absolutely, and relative to the number of people or computers or anything, really—search becomes an increasingly large part of the life of the programmer as well. A few applications lack the need to locate the right morsel in some information store, but very few.The subject of search is one of the largest in computer science, and thus I won't try to survey all of it or discuss the mechanics; in fact, I'll only consider one simple search technique in depth. Instead, I'll focus on the trade-offs that go into selecting search techniques, which can be subtle.You really can't talk about search without talking about time. There are two different flavors of time that apply to problems of search. The first is the time it takes the search to run, which is experienced by the user who may well be staring at a message saying something like "Loading…". The second is the time invested by the programmer who builds the search function, and by the programmer's management and customers waiting to use the program.Let's look at a sample problem to get a feel for how a search works in real life. I have a directory containing logfiles from my weblog (http://www.tbray.org/ongoing) from early 2003 to late 2006; as of the writing of this chapter, they recorded 140,070,104 transactions and occupied 28,489,788,532 bytes (uncompressed). All these statistics, properly searched, can answer lots of questions about my traffic and readership.Let's look at a simple question first: which articles have been read the most? It may not be instantly obvious that this problem is about search, but it is. First of all, you have to search through the logfiles to find the lines that record someone fetching an article. Second, you have to search through those lines to find the name of the article they fetched. Third, you have to keep track, for each article, of how often it was fetched.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- On Time
- InhaltsvorschauYou really can't talk about search without talking about time. There are two different flavors of time that apply to problems of search. The first is the time it takes the search to run, which is experienced by the user who may well be staring at a message saying something like "Loading…". The second is the time invested by the programmer who builds the search function, and by the programmer's management and customers waiting to use the program.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Problem: Weblog Data
- InhaltsvorschauLet's look at a sample problem to get a feel for how a search works in real life. I have a directory containing logfiles from my weblog (http://www.tbray.org/ongoing) from early 2003 to late 2006; as of the writing of this chapter, they recorded 140,070,104 transactions and occupied 28,489,788,532 bytes (uncompressed). All these statistics, properly searched, can answer lots of questions about my traffic and readership.Let's look at a simple question first: which articles have been read the most? It may not be instantly obvious that this problem is about search, but it is. First of all, you have to search through the logfiles to find the lines that record someone fetching an article. Second, you have to search through those lines to find the name of the article they fetched. Third, you have to keep track, for each article, of how often it was fetched.Here is an example of one line from one of these files, which wraps to fit the page in this book, but is a single long line in the file:
c80-216-32-218.cm-upc.chello.se - - [08/Oct/2006:06:37:48 -0700] "GET /ongoing/When/ 200x/2006/10/08/Grief-Lessons HTTP/1.1" 200 5945 "http://www.tbray.org/ongoing/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Reading from left to right, this tells us that:Somebody from an organization namedchelloin Sweden,
who provided neither a username nor a password,
contacted my weblog early in the morning of October 8, 2006 (my server's time zone is seven hours off Greenwich),
and requested a resource named /ongoing/When/200x/2006/10/08/Grief-Lessons
using the HTTP 1.1 protocol;
the request was successful and returned 5,945 bytes;
the visitor had been referred from my blog's home page,
and was using Internet Explorer 6 running on Windows XP.
This is an example of the kind of line I want: one that records the actual fetch of an article. There are lots of other lines that record fetching stylesheets, scripts, pictures, and so on, and attacks by malicious users. You can spot the kind of line I want by the fact that the article's name starts withEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Problem: Who Fetched What, When?
- InhaltsvorschauRunning a couple of quick scripts over the logfile data reveals that there are 12,600,064 instances of an article fetch coming from 2,345,571 different hosts. Suppose we are interested in who was fetching what, and when? An auditor, a police officer, or a marketing professional might be interested.So, here's the problem: given a hostname, report what articles were fetched from that host, and when. The result is a list; if the list is empty, no articles were fetched.We've already seen that a language's built-in hash or equivalent data structure gives the programmer a quick and easy way to store and look up key/value pairs. So, you might ask, why not use it?That's an excellent question, and we should give the idea a try. There are reasons to worry that it might not work very well, so in the back of our minds, we should be thinking of a Plan B. As you may recall if you've ever studied hash tables, in order to go fast, they need to have a small load factor; in other words, they need to be mostly empty. However, a hash table that holds 2.35 million entries and is still mostly empty is going to require the use of a whole lot of memory.To simplify things, I wrote a program that ran over all the logfiles and pulled out all the article fetches into a simple file; each line has the hostname, the time of the transaction, and the article name. Here are the first few lines:
crawl-66-249-72-77.googlebot.com 1166406026 2003/04/08/Riffs egspd42470.ask.com 1166406027 2006/05/03/MARS-T-Shirt 84.7.249.205 1166406040 2003/03/27/Scanner
(The second field, the 10-digit number, is the standard Unix/Linux representation of time as the number of seconds since the beginning of 1970.)Then I wrote a simple program to read this file and load a great big hash. shows the program.Example 4-5. Loading a big hash1 class BigHash 2 3 def initialize(file) 4 @hash = {} 5 lines = 0 6 File.open(file).each_line do |line| 7 s = line.split 8 article = s[2].intern 9 if @hash[s[0]] 10 @hash[s[0]] << [ s[1], article ] 11 else 12 @hash[s[0]] = [ s[1], article ] 13 end 14 lines += 1 15 STDERR.puts "Line: #{lines}" if (lines % 100000) == 0 16 end 17 end 18 19 def find(key) 20 @hash[key] 21 end 22 23 endEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Search in the Large
- InhaltsvorschauWhen most people think of search they think of web search, as offered by Yahoo!, Google, and their competitors. While ubiquitous web search is a new thing, the discipline of full-text search upon which it is based is not. Most of the seminal papers were written by Gerald Salton at Cornell as far back as the early 1960s. The basic techniques for indexing and searching large volumes of text have not changed dramatically since then. What has changed is how result ranking is done.The standard approach to full-text search is based on the notion of a posting, which is a small, fixed-size record. To build an index, you read all the documents and, for each word, create a posting that says word x appears in document y at position z. Then you sort all the words together, so that for each unique word you have a list of postings, each a pair of numbers consisting of a document ID and the text's offset in that document.Because postings are small and fixed in size, and because you tend to have a huge number of them, a natural approach is to use binary search. I have no idea of the details of how Google or Yahoo! do things, but I'd be really unsurprised to hear that those tens of thousands of computers spend a whole lot of their time binary-searching big arrays of postings.People who are knowledgeable about search shared a collective snicker a few years ago when the number of documents Google advertised as searching, after having been stuck at two billion and change for some years, suddenly became much larger and then kept growing. Presumably they had switched the document ID in all those postings from 32-bit to 64-bit numbers.Given a word, searching a list of postings to figure out which documents contain it is not rocket science. A little thought shows that combining the lists to do AND and OR queries and phrase search is also simple, conceptually at least. What's hard is sorting the result list so that the good results show up near the top. Computer science has a subdiscipline called Information Retrieval (IR for short) that focuses almost entirely on this problem. Historically, the results had been very poor, up until recently.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Conclusion
- InhaltsvorschauIt is hard to imagine any computer application that does not involve storing data and finding it based on its content. The world's single most popular computer application, web search, is a notable example.This chapter has considered some of the issues, notably bypassing the traditional "database" domain and the world of search strategies that involve external storage. Whether operating at the level of a single line of text or billions of web documents, search is central. From the programmer's point of view, it also needs to be said that implementing searches of one kind or another is, among other things, fun.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 5: Correct, Beautiful, Fast (in That Order): Lessons from Designing XML Verifiers
- InhaltsvorschauElliotte Rusty HaroldThis is the story of two routines that perform input verification for XML, the first in JDOM, and the second in XOM. I was intimately involved in the development of both, and while the two code bases are completely separate and share no common code, the ideas from the first clearly trickled into the second. The code, in my opinion, gradually became more beautiful. It certainly became faster.Speed was the driving factor in each successive refinement, but in this case the improvements in speed were accompanied by improvements in beauty as well. I hope to dispel the myth that fast code must be illegible, ugly code. On the contrary, I believe that more often than not, improvements in beauty lead to improvements in execution speed, especially taking into account the impact of modern optimizing compilers, just-in-time compilers, RISC (reduced instruction set computer) architectures, and multi-core CPUs.XML achieves interoperability by rigorously enforcing certain rules about what may and may not appear in an XML document. With a few very small exceptions, a conforming processor can process any well-formed XML document and can identify (and not attempt to process) malformed documents. This ensures a high degree of interoperability between platforms, parsers, and programming languages. You don't have to worry that your parser won't read my document because yours was written in C and runs on Unix, while mine was written in Java and runs on Windows.Fully maintaining XML correctness normally involves two redundant checks on the data:
-
Validation occurs on input. As a parser reads an XML document, it checks the document for well-formedness and, optionally, validity. Well-formedness checks purely syntactic constraints, such as whether every start tag has a matching end tag. This is required of all XML parsers. Validity means that only elements and attributes specifically listed in a Document Type Definition (DTD) appear, and only in the proper positions.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- The Role of XML Validation
- InhaltsvorschauXML achieves interoperability by rigorously enforcing certain rules about what may and may not appear in an XML document. With a few very small exceptions, a conforming processor can process any well-formed XML document and can identify (and not attempt to process) malformed documents. This ensures a high degree of interoperability between platforms, parsers, and programming languages. You don't have to worry that your parser won't read my document because yours was written in C and runs on Unix, while mine was written in Java and runs on Windows.Fully maintaining XML correctness normally involves two redundant checks on the data:
-
Validation occurs on input. As a parser reads an XML document, it checks the document for well-formedness and, optionally, validity. Well-formedness checks purely syntactic constraints, such as whether every start tag has a matching end tag. This is required of all XML parsers. Validity means that only elements and attributes specifically listed in a Document Type Definition (DTD) appear, and only in the proper positions.
-
Verification happens on output. When generating an XML document through an XML API such as DOM, JDOM, or XOM, the parser checks all strings passing through the API to make sure they're legal in XML.
While input validation is more thoroughly defined by the XML specification, output verification can be equally important. In particular, it is critical for debugging and making sure that the code is correct.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- The Problem
- InhaltsvorschauThe very first beta releases of JDOM did not verify the strings used to create element names, text content, or pretty much anything else. Programs were free to generate element names that contained whitespace, comments that ended in hyphens, text nodes that contained nulls, and other malformed content. Maintaining the correctness of the generated XML was completely left up to the client programmer.This bothered me. While XML is simpler than some alternatives, it is not simple enough that it can be fully understood without immersing yourself in specification arcana, such as exactly which Unicode code points are or are not legal in XML names and text content.JDOM aimed to be an API that brought XML to the masses. JDOM aimed to be an API that, unlike DOM, did not require a two-week course and an expensive expert mentor to learn to use properly. To enable this, JDOM needed to lift as much of the burden of understanding XML from the programmer as possible. Properly implemented, JDOM would keep the programmer from making mistakes.There are numerous ways JDOM could do this. Some of them fell out as a direct result of its data model. For instance, in JDOM it is not possible to overlap elements (
<p>Sally said, <quote>let's go the park.</p>. Then let's play ball.</quote>). Because JDOM's internal representation is a tree, there's simply no way to generate this markup from JDOM. However, a number of other constraints need to be checked explicitly, such as whether:-
The name of an element, attribute, or processing instruction is a legal XML name
-
Local names do not contain colons
-
Attribute namespaces do not conflict with the namespaces of their parent element or sibling attributes
-
Every Unicode surrogate character appears as part of a surrogate pair consisting of one high surrogate followed by one low surrogate
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Version 1: The Naïve Implementation
- InhaltsvorschauMy initial contribution to JDOM (shown in ) simply deferred the rule checks to Java's
Characterclass. ThecheckXMLNamemethod returns an error message if an XML name is invalid, and null if it's valid. This itself is a questionable design; it should probably throw an exception if the name is invalid, and return void in all other cases. Later in this chapter, you'll see how future versions addressed this.Example 5-2. The first version of name character verificationprivate static String checkXMLName(String name) { // Cannot be empty or null if ((name == null) || (name.length() == 0) || (name.trim().equals(""))) { return "XML names cannot be null or empty"; } // Cannot start with a number char first = name.charAt(0); if (Character.isDigit(first)) { return "XML names cannot begin with a number."; } // Cannot start with a $ if (first == '$') { return "XML names cannot begin with a dollar sign ($)."; } // Cannot start with a _ if (first == '-') { return "XML names cannot begin with a hyphen (-)."; } // Ensure valid content for (int i=0, len = name.length(); i<len; i++) { char c = name.charAt(i); if ((!Character.isLetterOrDigit(c)) && (c != '-') && (c != '$') && (c != '_')) { return c + " is not allowed in XML names."; } } // We got here, so everything is OK return null; }This method was straightforward and easy to understand. Unfortunately, it was wrong. In particular:-
It allowed names that contained colons. Because JDOM attempted to maintain namespace well-formedness, this had to be fixed.
-
The Java
Character.isLetterOrDigitandCharacter.isDigitmethods aren't perfectly aligned with XML's definition of letters and digits. Java considers some characters as letters that XML doesn't, and vice versa.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Version 2: Imitating the BNF Grammar O(N)
- InhaltsvorschauMy next contribution to JDOM manually translated the BNF productions into a series of
if-elsestatements. The result looked like . You'll notice that this version is quite a bit more complicated.Example 5-3. BNF-based name character verificationprivate static String checkXMLName(String name) { // Cannot be empty or null if ((name == null) || (name.length() == 0) || (name.trim().equals(""))) { return "XML names cannot be null or empty"; } // Cannot start with a number char first = name.charAt(0); if (!isXMLNameStartCharacter(first)) { return "XML names cannot begin with the character \"" + first + "\""; } // Ensure valid content for (int i=0, len = name.length(); i<len; i++) { char c = name.charAt(i); if (!isXMLNameCharacter(c)) { return "XML names cannot contain the character \"" + c + "\""; } } // We got here, so everything is OK return null; } public static boolean isXMLNameCharacter(char c) { return (isXMLLetter(c) || isXMLDigit(c) || c == '.' || c == '-' || c == '_' || c == ':' || isXMLCombiningChar(c) || isXMLExtender(c)); } public static boolean isXMLNameStartCharacter(char c) { return (isXMLLetter(c) || c == '_' || c ==':'); }Instead of simply reusing Java'sCharacter.isLetterOrDigitandCharacter.isDigitmethods, thecheckXMLNamemethod in delegates the checks toisXMLNameCharacterandisXMLNameStartCharacter. These methods further delegate to methods matching the other BNF productions for the different types of characters: letters, digits, combining characters, and extenders. shows one of these methods,isXMLDigit. Notice that this method considers not only the ASCII digits, but also the other digit characters included in Unicode 2.0. TheisXMLLetter, isXMLCombiningChar, andisXMLExtendermethods follow the same pattern. They're just longer.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Version 3: First Optimization O(log N)
- InhaltsvorschauAs Donald Knuth once said, "Premature optimization is the root of all evil in programming." However, although optimization matters less often than programmers think, it does matter; and this was one of the minority cases where it matters.Profiling proved that JDOM was spending a significant chunk of time performing verification. Every name character required several checks, and JDOM recognized a nonname character only after checking it first against every possible name character. Consequently, the number of checks increased in direct proportion to the code point value. The project maintainers were beginning to grumble that maybe verification wasn't so important after all, and they might make it optional or ditch it entirely. Now, personally, I'm not willing to compromise correctness in the name of faster code, but it was apparent that the decision was going to be taken out of my hands if someone didn't do something. Fortunately, Jason Hunter did.Hunter restructured my naïve code in a very clever way, shown in . Previously, even the common case where a character was legal required over 100 tests for each of the possible ranges of illegal characters. Hunter noticed that we could return a true value much sooner if we recognized both legal and illegal characters. This is especially beneficial in the common case where all names and content are ASCII, because these characters are the first ones we test.Example 5-5. Optimized digit character verification
public static booleanisXMLDigit(char c) { if (c < 0x0030) return false; if (c <= 0x0039) return true; if (c < 0x0660) return false; if (c <= 0x0669) return true; if (c < 0x06F0) return false; if (c <= 0x06F9) return true; if (c < 0x0966) return false; if (c <= 0x096F) return true; if (c < 0x09E6) return false; if (c <= 0x09EF) return true; if (c < 0x0A66) return false; if (c <= 0x0A6F) return true; if (c < 0x0AE6) return false; if (c <= 0x0AEF) return true; if (c < 0x0B66) return false; if (c <= 0x0B6F) return true; if (c < 0x0BE7) return false; if (c <= 0x0BEF) return true; if (c < 0x0C66) return false; if (c <= 0x0C6F) return true; if (c < 0x0CE6) return false; if (c <= 0x0CEF) return true; if (c < 0x0D66) return false; if (c <= 0x0D6F) return true; if (c < 0x0E50) return false; if (c <= 0x0E59) return true; if (c < 0x0ED0) return false; if (c <= 0x0ED9) return true; if (c < 0x0F20) return false; if (c <= 0x0F29) return true; return false; }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Version 4: Second Optimization: Don't Check Twice
- InhaltsvorschauAt this point, the time spent on verification had dropped by about a factor of four, and was no longer a huge concern. Version 3 is essentially what shipped in JDOM 1.0. However, by this point I had decided that JDOM was not good enough, and suspected that I could do better. My defection had more to do with issues of API design than with performance. I was also concerned with correctness, since JDOM still wasn't verifying everything it could, and it was still possible (though difficult) to use JDOM to create malformed documents. Consequently, I embarked on XOM.XOM, unlike JDOM, made no compromises on correctness in the name of performance. The rule in XOM was that correctness came first, always. Nonetheless, for people to choose XOM over JDOM, its performance was going to have to be comparable to or better than that of JDOM. Thus, it was time to take another whack at the verification problem.The optimization efforts of JDOM version 3 had improved the performance of the
checkXMLNamemethod, but I hoped to eliminate it completely in this next optimization. The reason for this is that you don't always need to check the XML input if it's coming from a known good source. In particular, an XML parser carries out many of the necessary checks before the input reaches the XML verifier, and there's no reason to duplicate this work. Because the constructors always checked for correctness, they caused a real drain on parsing speed performance, which in practice was a large fraction (often a substantial majority) of the time an application spent on each document.The use of separate paths for separate types of input would resolve this issue. I had determined that constructors should not verify the element names when creating an object from strings that the parser had already read and checked in the document. Conversely, constructors should verify the element names when creating an object from strings passed by the library client. Clearly, two different constructors were needed; one for the parser and one for everybody else.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Version 5: Third Optimization O(1)
- InhaltsvorschauAfter I implemented the constructor detailed in the previous section and added some additional optimizations, XOM was fast enough for anything I needed to do. Read performance was essentially limited only by parser speed and there were very few bottlenecks left in the document-building process.However, other users with different use cases were encountering different problems. In particular, some users were writing custom builders that read non-XML formats into a XOM tree. They were not using an XML parser, and therefore were not able to take the shortcut that bypassed name verification. These users were still seeing verification as a hot spot, albeit a smaller one than it had been.I wasn't willing to turn off verification completely, despite requests to do so. However, it was obvious that the verification process had to be sped up. The approach I took is an old optimization classic: table lookup. In a table lookup, you create a table that contains all the answers for all the known inputs. When given any input, the compiler can simply look up the answer in the table, without having to perform a calculation. This is an O(1) operation, and its performance speed is close to the theoretical maximum. Of course, the devil is in the details.The simplest way to implement table lookup in Java is with a
switchstatement. javac com-piles this statement into a table of values stored in the byte code. Depending on theswitchstatement cases, the compiler creates one of two byte code instructions. If the cases are contiguous (e.g., 72–189 without skipping any values in between) the compiler uses a more efficienttableswitch. However, if any values are skipped, the compiler uses the more indirect and less efficientlookupswitchinstruction instead.This behavior isn't absolutely guaranteed, and may perhaps not even be true in more recent virtual machines (VMs), but it certainly was true in the generation of VMs I tested and inspectedEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Version 6: Fourth Optimization: Caching
- InhaltsvorschauIf you can't make the verification go any faster, the only remaining option is not to do it, or at least not do so much of it. This approach was suggested by Wolfgang Hoschek. He noticed that in an XML document the same names keep coming up over and over. For instance, in an XHTML document, there are only about 100 different element names, and a few dozen of those account for most elements (
p, table, div, span, strong, and so on). Once you've verified that a name is legal, you can store it in a collection somewhere. The next time you see a name, you first check to see whether it's one of the names you've seen before; if it is, you just accept it and don't check it again.However, you do have to be very careful here. It may take longer to find some names (especially shorter ones) in a collection such as a hash map than it would take to check them all over again. The only way to tell is to benchmark and profile the caching scheme very carefully on several different VMs using different kinds of documents. You may need to tune parameters such as the size of the collection to fit different kinds of documents, and what works well for one document type may not work well for another. Furthermore, if the cache is shared between threads, thread contention can become a serious problem.Consequently, I have not yet implemented this scheme for element names. However, I have implemented it for namespace URIs (Uniform Resource Identifiers), which have even more expensive verification checks than do element names, and which are even more repetitive. For instance, many documents have only one namespace URI, and very few have more than four, so the potential gain here is much larger. shows the inner class that XOM uses to cache namespace URIs after it has verified them.Example 5-11. A cache for verified namespace URIsprivate final static class URICache { private final static int LOAD = 6; private String[] cache = new String[LOAD]; private int position = 0; synchronized boolean contains(String s) { for (int i = 0; i < LOAD; i++) { // Here I'm assuming the namespace URIs are interned. // This is commonly but not always true. This won't // break if they haven't been. Using equals() instead // of == is faster when the namespace URIs haven't been // interned but slower if they have. if (s == cache[i]) { return true; } } return false; } synchronized void put(String s) { cache[position] = s; position++; if (position == LOAD) position = 0; } }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Moral of the Story
- InhaltsvorschauIf there's a moral to this story, it is this: do not let performance considerations stop you from doing what is right. You can always make the code faster with a little cleverness. You can rarely recover so easily from a bad design.Programs usually become faster over time, not slower. Faster CPUs are a part of that process, but not the most important. Improved algorithms are a much bigger contributor. Therefore, design the program you want in the way it should be designed. Then, and only then, should you worry about performance. More often than not, you'll discover the program is fast enough on your first pass, and you won't even need to play tricks like those outlined here. However, when you do, it's much easier to make beautiful-but-slow code fast than it is to make fast-but-ugly code beautiful.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 6: Framework for Integrated Test: Beauty Through Fragility
- InhaltsvorschauMichael FeathersI have some ideas about what good design is. Every programmer does. We all develop these ideas through practice, and we draw on them as we work. If we're tempted to use a public variable in a class, we remember that public variables are usually a symptom of bad design, and if we see implementation inheritance, we remember that we should prefer delegation to inheritance.Rules like these are useful. They help us move our way through the design space as we work, but we do ourselves a disservice if we forget that they are just rules of thumb. If we forget, we can end up with design where we are "doing everything" right, but we still miss the mark.These thoughts were driven home to me back in 2002 when Ward Cunningham released Framework for Integrated Test (FIT), his automated testing framework. FIT consists of a small set of elegant Java classes. They maneuver in a path around nearly every rule of thumb about design in the Java community, and each little turn that they make is compelling. They stand in stark contrast to design that just follows the rules.To me, FIT is beautiful code. It's an invitation to think about the contextual nature of design.In this chapter, I'll walk through one of the earliest released versions of FIT. I'll show how FIT deviates from much of the current accepted wisdom of Java and OO framework development, and describe how FIT challenged me to reconsider some of my deeply held preconceptions about design. I don't know whether you'll reconsider yours after reading this chapter, but I invite you to look just the same. With luck, I'll be able to express what makes FIT's design special.FIT is relatively simple to explain. It's a little framework that lets you write executable application tests in HTML tables. Each type of table is processed by a programmer-defined class called a fixture. When the framework processes a page of HTML, it creates a fixture object for each table in the page. The fixture uses the table as input to validation code of your choice: it reads cell values, communicates with your application, checks expected values, and marks cells green or red to indicate success or failure of a check.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- An Acceptance Testing Framework in Three Classes
- InhaltsvorschauFIT is relatively simple to explain. It's a little framework that lets you write executable application tests in HTML tables. Each type of table is processed by a programmer-defined class called a fixture. When the framework processes a page of HTML, it creates a fixture object for each table in the page. The fixture uses the table as input to validation code of your choice: it reads cell values, communicates with your application, checks expected values, and marks cells green or red to indicate success or failure of a check.The first cell in the table specifies the name of the fixture class that will be used to process the table. For instance, shows a table that will be processed by the
MarketEvaluationfixture. shows the same table after FIT has processed it; onscreen, the shaded cells would be red to show a validation failure.
Figure 6-1: HTML table displayed before FIT processing
Figure 6-2: HTML table displayed after FIT processingThe key idea behind FIT is that documents can serve as tests. You could, for instance, embed tables in a requirements document and run the document through FIT to see whether the behavior specified in those tables exists in your software. These documents with tables can be written directly in HTML, or they can be written in Microsoft Word or any other application that can save documents as HTML. Because a FIT fixture is just a piece of software, it can call any portion of an application you care to test, and make those calls at any level. It's all under your control as a programmer.I won't spend any more time explaining FIT and its problem domain; there's more information on the FIT web site (http://fit.c2.com). But I do want to describe the design of FIT and some of the interesting choices it embodies.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Challenge of Framework Design
- InhaltsvorschauOK, that's the architecture of FIT. Let's talk about what makes it different.Framework design is tough. The hardest thing about it is that you have no control over code that uses your framework. Once you've published your framework's API, you can't change the signatures of particular methods or change their semantics without forcing some work on the users of the framework. And, typically, framework users do not like to revisit their code when they upgrade; they want complete backward-compatibility.The traditional way of handling this problem is to adopt certain rules of thumb:
-
Be very careful about what you make public: control visibility to keep the published parts of the framework small.
-
Use interfaces.
-
Provide well-defined "hook points" that permit extensibility in the places where you intend it to occur.
-
Prevent extension in places where you don't want it to occur.
Some of these guidelines have been written down in various places, but, for the most part, they are cultural. They are just common ways that framework developers design.Let's look at an example outside of FIT that shows those rules in action: the JavaMail API.If you want to receive mail using JavaMail, you have to get a reference to aSessionobject.Sessionis a class that has a static method namedgetDefaultInstance:package javax.mail; public final class Session { ... public static Session getDefaultInstance(Properties props); ... public Store getStore( ) throws NoSuchProviderException; ... }TheSessionclass is final. In Java, this means that we cannot create subclasses ofSession. Furthermore,Sessiondoesn't have a public constructor, so we can't create an instance ourselves; we have to use that static method to get an instance.Once we have a session, we can use theEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- An Open Framework
- InhaltsvorschauFIT is a very different kind of framework. You might have noticed from the UML diagram in that nearly everything in the core framework is public, even the data. If you glance though the code of other popular frameworks, you'll find nothing quite like it. How could this possibly work? It works because FIT is an example of an open framework. It doesn't have a small set of designed extension points; the entire framework was designed to be extensible.Let's take a look at the
Fixtureclass. Clients of theFixtureclass use it in a very direct way. They create an instance ofFixture, create a parse tree for an HTML document usingParse, and then pass the tree to thedoTablesmethod. ThedoTablesmethod then callsdoTablefor each table in the document, passing along the appropriate parse subtree.ThedoTablemethod looks like this:public void doTable(Parse table) { doRows(table.parts.more); }And the method it calls,doRows, looks like this:public voiddoRows(Parse rows) { while (rows != null) { doRow(rows); rows = rows.more; } }ThedoRowmethod, in turn, callsdoCells:public void doRow(Parse row) {doCells(row.parts); }And the sequence bottoms out in a method calleddoCell:public void doCell(Parse cell, int columnNumber) {ignore(cell); }Theignoremethod simply adds the color gray to the cell, indicating that the cell has been ignored:public static void ignore (Parse cell) { cell.addToTag(" bgcolor=\"#efefef\""); ignores++; }As defined, it doesn't look likeFixturedoes much of anything at all. All it does is traverse a document and turn cells gray. However, subclasses ofFixturecan override any of those methods and do different things. They can gather information, save information, communicate with the application under test, and mark cell values.Fixturedefines the default sequence for traversing an HMTL document.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - How Simple Can an HTML Parser Be?
- InhaltsvorschauIn addition to being an open framework, FIT presents some other surprising design choices. Earlier, I mentioned that all of FIT's HTML parsing is done by the
Parseclass. One of the things that I love the most about theParseclass is that it constructs an entire tree with its constructors.Here's how it works. You create an instance of the class with a string of HTML as a constructor argument:String input = read(new File(argv[0]);Parse parse = new Parse(input);
TheParseconstructor recursively constructs a tree ofParseinstances, each of which represents a portion of the HTML document. The parsing code is entirely within the constructors ofParse.EachParseinstance has five public strings and two references to otherParseobjects:public String leader; public String tag; public String body; public String end; public String trailer; public Parse more; public Parse parts;
When you construct your firstParsefor an HMTL document, in a sense, you've constructed all of them. From that point on, you can usemoreandpartsto traverse nodes. Here's the parsing code in theParseclass:static String tags[] = {"table", "tr", "td"}; public Parse (String text) throws ParseException { this (text, tags, 0, 0); } public Parse (String text, String tags[]) throws ParseException { this (text, tags, 0, 0); } public Parse (String text, String tags[], int level, int offset) throws ParseException { String lc = text.toLowerCase( ); int startTag = lc.indexOf("<"+tags[level]); int endTag = lc.indexOf(">", startTag) + 1; int startEnd = lc.indexOf("</"+tags[level], endTag); int endEnd = lc.indexOf(">", startEnd) + 1; int startMore = lc.indexOf("<"+tags[level], endEnd); if (startTag<0 || endTag<0 || startEnd<0 || endEnd<0) { throw new ParseException ("Can't find tag: "+tags[level], offset); } leader = text.substring(0,startTag); tag = text.substring(startTag, endTag); body = text.substring(endTag, startEnd); end = text.substring(startEnd,endEnd); trailer = text.substring(endEnd); if (level+1 < tags.length) { parts = newParse (body, tags, level+1, offset+endTag); body = null; } if (startMore>=0) { more = new Parse (trailer, tags, level, offset+endEnd); trailer = null; } }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauMany of us in the industry have learned our lessons through the school of hard knocks. We've run into situations where software we've written earlier is not as extensible as we wish. Over time, we've reacted by gathering rules of thumb that attempt to preserve extensibility by restricting choices. If you are developing a framework and you have thousands of users, this may be the best thing that you can do, but it isn't the only way.The alternative that FIT demonstrates is radical: try to make the framework as flexible and concise as you can, not by factoring it into dozens of classes but by being very careful about how each class is factored internally. You make methods public so that when users want to stray from the normal course, they can, but you also make some hard choices, like FIT's choice of HTML as a medium.If you design a framework like this, you end with a very different piece of software. Once the classes are opened up, it will be hard to change them in future releases, but what you produce may be small and understandable enough to lead people to create something new.And that isn't a trivial feat. The world is filled with frameworks that are a little too hard to understand and look like they have a bit too much investment in them—so much investment that it's hard to justify reinventing the wheel. When that happens, we end up with large frameworks that miss important areas of use.Putting frameworks aside, there's a lot to say for adopting this "open" style of development in small projects. If you know all of the users of your code—if, for instance, they all sit with you on the same project—you can choose to lower the defenses in some of your code and make it as simple and clean as FIT's code.Not all code is API code. It's easy for us to forget this because we hear the horror stories (the cases when one team locks down another by silently introducing dependencies that can't be eradicated), but if you are one team—if the clients of your code can and will make changes when you ask them to—you can adopt a different style of development and move toward quite open code. Language-based protection and design-based protection are solutions to a problem, but the problem is social, and it's important to consider whether you really have it or not.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 7: Beautiful Tests
- InhaltsvorschauAlberto SavoiaMost programmers have had the experience of looking at a piece of code and thinking it was not only functional but also beautiful. Code is typically considered beautiful if it does what it's supposed to do with unique elegance and economy.But what about the tests for that beautiful code—especially the kind of tests that developers write, or should write, while they are working on the code? In this chapter, I am going to focus on tests, because tests can be beautiful themselves. More importantly, they can play a key role in helping you create more beautiful code.As we will see, a combination of things makes tests beautiful. Unlike code, I can't bring myself to consider any single test beautiful—at least not in the same way I can look at, say, a sorting routine and call it beautiful. The reason is that testing, by its very nature, is a combinatorial and exploratory problem. Every
ifstatement in the code requires at least two tests (one test for when the condition evaluates to true and one when it evaluates to false). Anifstatement with multiple conditions, such as:if ( a || b || c )
could require, in theory, up to eight tests—one for each possible combination of the values ofa, b, andc. Throw in control loops, multiple input parameters, dependencies on external code, different hardware and software platforms, etc., and the number and types of tests needed increases considerably.Any nontrivial code, beautiful or not, needs not one, but a team of tests, where each test should be focused on checking a specific aspect of the code, similar to the way different players on a sports team are responsible for different tasks and different areas of the playing field.Now that we have determined that we should evaluate tests in groups, we need to determine what characteristics would make a group of tests beautiful—an adjective rarely applied to them.Generally speaking, the main purpose of tests is to instill, reinforce, or reconfirm our confidence that the code works properly and efficiently. Therefore, to me, the most beautiful tests are those that help me maximize my confidence that the code does, and will continue to do, what it's supposed to. Because different types of tests are needed to verify different properties of the code, the basic criteria for beauty vary. This chapter explores three ways tests can be beautiful:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - That Pesky Binary Search
- InhaltsvorschauTo demonstrate various testing techniques while keeping this chapter reasonably short, I need an example that's simple to describe and that can be implemented in a few lines of code. At the same time, the example must be juicy enough to provide some interesting testing challenges. Ideally, this example should have a long history of buggy implementations, demonstrating the need for thorough testing. And, last but not least, it would be great if this example itself could be considered beautiful code.It's hard to talk about beautiful code without thinking about Jon Bentley's classic book Programming Pearls (Addison-Wesley). As I was rereading the book, I hit the beautiful code example I was looking for: a binary search.As a quick refresher, a binary search is a simple and effective algorithm (but, as we'll see, tricky to implement correctly) to determine whether a presorted array of numbers x[0..n-1] contains a target element t. If the array contains t, the program returns its position in the array; otherwise, it returns
-1.Here's how Jon Bentley described the algorithm to the students:Binary search solves the problem by keeping track of the range within the array that holds t (if t is anywhere in the array). Initially, the range is the entire array. The range is shrunk by comparing its middle element to t and discarding half the range. The process continues until t is discovered in the array or until the range in which it must lie is known to be empty.He adds:Most programmers think that with the above description in hand, writing the code is easy. They are wrong. The only way to believe this is by putting down this column right now and writing the code yourself. Try it.I second Bentley's suggestion. If you have never implemented binary search, or haven't done so in a few years, I suggest you try that yourself before going forward; it will give you greater appreciation for what follows.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Introducing JUnit
- InhaltsvorschauWhen speaking of beautiful tests, it's hard not to think of the JUnit testing framework. Because I'm using Java, deciding to build my beautiful tests around JUnit was a very easy decision. But before I do that, in case you are not already familiar with JUnit, let me say a few words about it.JUnit is the brainchild of Kent Beck and Erich Gamma, who created it to help Java developers write and run automated and self-verifying tests. It has the simple, but ambitious, objective of making it easy for software developers to do what they should have done all along: test their own code.Unfortunately, we still have a long way to go before the majority of developers are test-infected (i.e., have experimented with developer testing and decided to make it a regular and important part of their development practices). However, since its introduction, JUnit (helped considerably by eXtreme Programming and other Agile methodologies, where developer involvement in testing is nonnegotiable) has gotten more programmers to write tests than anything else. Martin Fowler summed up JUnit's impact as follows: "Never in the field of software development was so much owed by so many to so few lines of code."JUnit is intentionally simple. Simple to learn. Simple to use. This was a key design criterion. Kent Beck and Erich Gamma took great pains to make sure that JUnit was so easy to learn and use that programmers would actually use it. In their own words:So, the number one goal is to write a framework within which we have some glimmer of hope that developers will actually write tests. The framework has to use familiar tools, so that there is little new to learn. It has to require no more work than absolutely necessary to write a new test. It has to eliminate duplicate effort.The official getting-started documentation for JUnit (the JUnit Cookbook) fits in less than two pages:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Nailing Binary Search
- InhaltsvorschauGiven its history, I am not going to be fooled by the apparent simplicity of binary search, or by the obviousness of the fix, especially because I've never used the unsigned bit shift operator (i.e., >>>) in any other code. I am going to test this fixed version of binary search as if I had never heard of it before, nor implemented it before. I am not going to trust anyone's word, or tests, or proofs, that this time it will really work. I want to be confident that it works as it should through my own testing. I want to nail it.Here's my initial testing strategy (or team of tests):
-
Start with smoke tests.
-
Add some boundary value tests.
-
Continue with various thorough and exhaustive types of tests.
-
Finally, add some performance tests.
Testing is rarely a linear process. Instead of showing you the finished set of tests, I am going to walk you through my thought processes while I am working on the tests.Let's get started with the smoke tests. These are designed to make sure that the code does the right thing when used in the most basic manner. They are the first line of defense and the first tests that should be written, because if an implementation does not pass the smoke tests, further testing is a waste of time. I often write the smoke tests before I write the code; this is called test-driven development (or TDD).Here's my smoke test for binary search:import static org.junit.Assert.*; import org.junit.Test; public class BinarySearchSmokeTest { @Test public void smokeTestsForBinarySearch() { int[] arrayWith42 = new int[] { 1, 4, 42, 55, 67, 87, 100, 245 }; assertEquals(2, Util.binarySearch(arrayWith42, 42)); assertEquals(-1, Util.binarySearch(arrayWith42, 43)); } }As you can tell, this test is really,Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Conclusion
- InhaltsvorschauIn this chapter, we have seen that even the best developers and the most beautiful code can benefit from testing. We have also seen that writing test code can be every bit as creative and challenging as writing the target code. And, hopefully, I've shown you that tests themselves can be considered beautiful in at least three different ways.Some tests are beautiful for their simplicity and efficiency. With a few lines of JUnit code, run automatically with every build, you can document the code's intended behavior and boundaries, and ensure that both of them are preserved as the code evolves.Other tests are beautiful because, in the process of writing them, they help you improve the code they are meant to test in subtle but important ways. They may not discover proper bugs or defects, but they bring to the surface problems with the design, testability, or maintainability of the code; they help you make your code more beautiful.Finally, some tests are beautiful for their breadth and thoroughness. They help you gain confidence that the functionality and performance of the code match requirements and expectations, not just on a few handpicked examples, but with a wide range of inputs and conditions.Developers who want to write beautiful code can learn something from artists. Painters regularly put down their brushes, step away from the canvas, circle it, cock their heads, squint, and look at it from different angles and under different lights. They need to develop and integrate those perspectives in their quest for beauty. If your canvas is an IDE and your medium is code, think of testing as your way of stepping away from the canvas to look at your work with critical eyes and from different perspectives—it will make you a better programmer and help you create more beautiful code.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 8: On-the-Fly Code Generation for Image Processing
- InhaltsvorschauCharles PetzoldAmong the pearls of wisdom and wackiness chronicled in Steven Levy's classic history Hackers: Heroes of the Computer Revolution (Doubleday), my favorite is this one by Bill Gosper, who once said, "Data is just a dumb kind of programming." The corollary, of course, is that code is just a smart kind of data—data designed to trigger processors into performing certain useful or amusing acts.The potential interplay of code and data tends to be discouraged in most conventional programming instruction. Code and data are usually severely segregated; even in object-oriented programming, code and data have their own special roles to play. Any intermingling of the two—such as data being executed as if it were machine code—is considered to be a violation of natural law.Only occasionally is this barrier between code and data breached. Compiler authors write programs that read source code and generate machine code, but compilers do not really violate the separation of code and data. Where the input and output are code to the human programmers, they are just data to the compilers. Other odd jobs, such as those performed by disassemblers or simulators, also read machine code as if it were data.As we all accept rationally, if not emotionally, code and data are ultimately just bytes, and there are only 256 of them in the entire universe. It's not the bytes themselves but their ordering that gives them meaning and purpose.In some special cases, it can be advantageous for programs that are not compilers to generate code while they're running. This on-the-fly code generation is not easy, so it's usually restricted to very particular circumstances.Throughout this chapter, we will use an example that embodies the most common reason for using on-the-fly code generation. In this example, a time-critical subroutine must perform many repetitive operations. A number of generalized parameters come into play during the execution of these repetitive operations, and the subroutine could run a lot faster if we replaced those generalized parameters with specific values. We can't replace those parameters while we're writing the subroutine because the parameters aren't known until runtime, and they can change from one invocation to the next. However, the subroutine itself could do the code generation while it's running. In other words, the subroutine can examine its own parameters at runtime, generate more efficient code, and then execute the resulting code.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 9: Top Down Operator Precedence
- InhaltsvorschauDouglas CrockfordIn 1973, vaughan pratt presented "top down operator precedence" at the first annual Principles of Programming Languages Symposium in Boston. In the paper, Pratt described a parsing technique that combines the best properties of Recursive Descent and the Operator Precedence syntax technique of Robert W Floyd.. He claimed that the technique is simple to understand, trivial to implement, easy to use, extremely efficient, and very flexible. I will add that it is also beautiful.It might seem odd that such an obviously utopian approach to compiler construction is completely neglected today. Why is this the case? Pratt suggested in the paper that preoccupation with BNF grammars and their various offspring, along with their related automata and theorems, have precluded development in directions that are not visibly in the domain of automata theory.Another explanation is that his technique is most effective when used in a dynamic, functional programming language. Its use in a static, procedural language would be considerably more difficult. In his paper, Pratt used LISP and almost effortlessly built parse trees from streams of tokens.But parsing techniques are not greatly valued in the LISP community, which celebrates the Spartan denial of syntax. There have been many attempts since LISP's creation to give the language a rich, ALGOL-like syntax, including:
- Pratt's CGOL
- LISP 2
- MLISP
- Dylan
- Interlisp's Clisp
- McCarthy's original M-expressions
All failed to find acceptance. The functional programming community found the correspondence between programs and data to be much more valuable than expressive syntax. But the mainstream programming community likes its syntax, so LISP has never been accepted by the mainstream. Pratt's technique befits a dynamic language, but dynamic language communities historically have had no use for the syntax that Pratt's technique realizes.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - JavaScript
- InhaltsvorschauThe situation changed with the advent of JavaScript. JavaScript is a dynamic, functional language, but in a syntactic sense, it is obviously a member of the C family. JavaScript is a dynamic language with a community that likes syntax. In addition, JavaScript is object oriented. Pratt's 1973 paper anticipated object orientation but lacked an expressive notation for it. JavaScript has an expressive object notation. Thus, JavaScript is an ideal language for exploiting Pratt's technique. I will show that we can quickly and inexpensively produce parsers in JavaScript.We don't have time in this short chapter to deal with the whole JavaScript language, and perhaps we wouldn't want to because the language is a mess. But it has some brilliant stuff in it, which is worthy of your consideration. We will build a parser that can process Simplified JavaScript, and we will write that parser in Simplified JavaScript. Simplified JavaScript is just the good stuff, including:
- Functions as first-class objects
-
Functions are lambdas with lexical scoping.
- Dynamic objects with prototypal inheritance
-
Objects are class-free. We can add a new member to any object by ordinary assignment. An object can inherit members from another object.
- Object literals and array literals.
-
This is a very convenient notation for creating new objects and arrays. JavaScript literals were the inspiration for the JSON data interchange (http://www.JSON.org) format.
We will take advantage of JavaScript's prototypal nature to make token objects that inherit from symbols, and symbols that inherit from an original symbol. We will depend on theobjectfunction, which makes a new object that inherits members from an existing object. Our implementation will also depend on a tokenizer that produces an array of simple token objects from a string. We will advance through the array of tokens as we weave our parse tree.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Symbol Table
- InhaltsvorschauWe will use a symbol table to drive our parser:
var symbol_table = {};Theoriginal_symbolobject will be the prototype for all other symbols. It contains methods that report errors. These will usually be overridden with more useful methods:var original_symbol = { nud: function ( ) { this.error("Undefined."); }, led: function (left) { this.error("Missingoperator."); } };Let's define a function that defines symbols. It takes a symbolidand an optional binding power that defaults to zero. It returns a symbol object for thatid. If the symbol already exists in thesymbol_table, it returns that symbol object. Otherwise, it makes a new symbol object that inherits fromoriginal_symbol, stores it in the symbol table, and returns it. A symbol object initially contains anid, a value, a left binding power, and the stuff it inherits from theoriginal_symbol:var symbol = function (id, bp) { var s = symbol_table[id]; bp = bp || 0; if (s) { if (bp >= s.lbp) { s.lbp = bp; } } else { s = object(original_symbol); s.id = s.value = id; s.lbp = bp; symbol_table[id] = s; } return s; };The following symbols are popular separators and closers:symbol(":"); symbol(";"); symbol(","); symbol(")"); symbol("]"); symbol("}"); symbol("else");The(end)symbol indicates that there are no more tokens. The(name)symbol is the prototype for new names, such as variable names. They are spelled strangely to avoid collisions:symbol("(end)"); symbol("(name)");The(literal)symbol is the prototype for all string and number literals:var itself = function ( ) { return this; }; symbol("(literal)").nud = itself;Thethissymbol is a special variable. In a method invocation, it is the reference to the object:symbol("this").nud = function ( ) { scope.reserve(this); this.arity = "this"; return this; };Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Tokens
- InhaltsvorschauWe assume that the source text has been transformed into an array of simple token objects
(tokens), each containing atypemember that is a string("name", "string", "number", "operator")and avaluemember that is a string or number.Thetokenvariable always contains the current token:var token;
Theadvancefunction makes a new token object and assigns it to thetokenvariable. It takes an optionalidparameter, which it can check against theidof the previous token. The new token object's prototype will be a name token in the current scope or a symbol from the symbol table. The new token'saritywill be"name", "literal", or"operator". Itsaritymay be changed later to"binary", "unary", or"statement"when we know more about the token's role in the program:var advance = function (id) { var a, o, t, v; if (id && token.id !== id) { token.error("Expected '" + id + "'."); } if (token_nr >= tokens.length) { token = symbol_table["(end)"]; return; } t = tokens[token_nr]; token_nr += 1; v = t.value; a = t.type; if (a === "name") { o = scope.find(v); } else if (a === "operator") { o = symbol_table[v]; if (!o) { t.error("Unknown operator."); } } else if (a === "string" || a === number") { a = "literal"; o = symbol_table["(literal)"]; } else { t.error("Unexpected token."); } token = object(o); token.value = v; token.arity = a; return token; };Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Precedence
- InhaltsvorschauTokens are objects that bear methods that allow them to make precedence decisions, match other tokens, and build trees (and in a more ambitious project also check types, optimize, and generate code). The basic precedence problem is this: given an operand between two operators, is the operand bound to the left operator or the right? Thus, if A and B are operators in:
dA e B f
does operandebind to A or to B? In other words, are we talking about:(dA e) B f
or:dA (e B f)
Ultimately, the complexity in the process of parsing comes down to the resolution of this ambiguity. The technique we will develop here uses token objects whose members include binding powers (or precedence levels), and simple methods callednud(null denotation) andled(left denotation). Anuddoes not care about the tokens to the left. Aleddoes. Anudmethod is used by values (such as variables and literals) and by prefix operators. Aledmethod is used by infix operators and suffix operators. A token may have both anudand aled. For example, -might be both a prefix operator (negation) and an infix operator (subtraction), so it would have bothnudandledmethods.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Expressions
- InhaltsvorschauThe heart of Pratt's technique is the
expressionfunction. It takes a right binding power that controls the aggressiveness of its consumption of tokens that it sees to the right. It returns the result of calling methods on the tokens it acts upon:var expression = function (rbp) { var left; var t = token; advance( ); left = t.nud( ); while (rbp < token.lbp) { t = token; advance( ); left = t.led(left); } return left; }expressioncalls thenudmethod of thetoken. Thenudis used to process literals, variables, and prefix operators. After that, as long as the right binding power is less than the left binding power of the next token, theledmethods are invoked. Theledis used to process infix and suffix operators. This process can be recursive because thenudandledmethods can callexpression.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Infix Operators
- InhaltsvorschauThe + operator is an infix operator, so it will have a
ledmethod that makes the token object into a tree, where the two branches are the operand to the left of the + and the operand to the right. The left operand is passed into theled, and the right is obtained by calling theexpressionmethod.The number60is the binding power of +. Operators that bind tighter or have higher precedence have greater binding powers. In the course of mutating the stream of tokens into a parse tree, we will use the operator tokens as containers of operand nodes:symbol("+", 60).led = function (left) { this.first = left; this.second = expression(60); this.arity = "binary"; return this; };When we define the symbol for *, we see that only theidand binding powers are different. It has a higher binding power because it binds more tightly:symbol("*", 70).led = function (left) { this.first = left; this.second = expression(70); this.arity = "binary"; return this; };Not all infix operators will be this similar, but many will, so we can make our work easier by defining aninfixfunction that will help us specify infix operators. Theinfixfunction will take anidand a binding power, and optionally aledfunction. If aledfunction is not provided, it will supply a defaultledthat is useful in most cases:var infix = function (id, bp, led) { var s = symbol(id, bp); s.led = led || function (left) { this.first = left; this.second = expression(bp); this.arity = "binary"; return this; }; return s; }This allows a more declarative style for specifying operators:infix("+", 60); infix("-", 60); infix("*", 70); infix("/", 70);The string === is JavaScript's exact-equality comparison operator:infix("===", 50); infix("!==", 50); infix("<", 50); infix("<=", 50); infix(">", 50); infix(">=", 50);Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Prefix Operators
- InhaltsvorschauWe can do a similar thing for prefix operators. Prefix operators are right associative. A prefix does not have a left binding power because it does not bind to the left. Prefix operators can sometimes be reserved words (reserved words are discussed in the section "Scope," later in this chapter):
var prefix = function (id, nud) { var s = symbol(id); s.nud = nud || function ( ) { scope.reserve(this); this.first = expression(80); this.arity = "unary"; return this; }; return s; } prefix("-"); prefix("!"); prefix("typeof");Thenudof ( will calladvance(")") to match a balancing ) token. The ( token does not become part of the parse tree because thenudreturns the expression:prefix("(", function ( ) { var e = expression(0); advance(")"); return e; });Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Assignment Operators
- InhaltsvorschauWe could use
infixrto define our assignment operators, but we want to do two extra bits of business, so we will make a specializedassignmentfunction. It will examine the left operand to make sure that it is a proper lvalue. We will also set anassignmentflag so that we can later quickly identify assignment statements:var assignment = function (id) { return infixr(id, 10, function (left) { if (left.id !== "." && left.id !== "[" && left.arity !== "name") { left.error("Bad lvalue."); } this.first = left; this.second = expression(9); this.assignment = true; this.arity = "binary"; return this; }); }; assignment("="); assignment("+="); assignment("-=");Notice that we have a sort of inheritance pattern, whereassignmentreturns the result of callinginfixr, andinfixrreturns the result of callingsymbol.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Constants
- InhaltsvorschauThe
constantfunction builds constants into the language. Thenudmutates a name token into a literal token:var constant = function (s, v) { var x = symbol(s); x.nud = function ( ) {scope.reserve(this); this.value = symbol_table[this.id].value; this.arity = "literal"; return this; }; x.value = v; return x; }; constant("true", true); constant("false", false); constant("null", null); constant("pi", 3.141592653589793);Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Scope
- InhaltsvorschauWe use functions such as
infixandprefixto define the symbols used in the language. Most languages have some notation for defining new symbols, such as variable names. In a very simple language, when we encounter a new word, we might give it a definition and put it in the symbol table. In a more sophisticated language, we would want a notion of scope, giving the programmer convenient control over the lifespan and visibility of a variable.A scope is a region of a program in which a variable is defined and accessible. Scopes can be nested inside of other scopes. Variables defined in a scope are not visible outside of the scope.We will keep the current scope object in thescopevariable:var scope;
original_scopeis the prototype for all scope objects. It contains adefinemethod that is used to define new variables in the scope. Thedefinemethod transforms a name token into a variable token. It produces an error if the variable has already been defined in the scope or if the name has already been used as a reserved word:var original_scope = { define: function (n) { var t = this.def[n.value]; if (typeof t === "object") { n.error(t.reserved ? "Already reserved." : "Already defined."); } this.def[n.value] = n; n.reserved = false; n.nud = itself; n.led = null; n.std = null; n.lbp = 0; n.scope = scope; return n; },Thefindmethod is used to find the definition of a name. It starts with the current scope and will go, if necessary, back through the chain of parent scopes and ultimately to the symbol table. It returnssymbol_table["(name")]if it cannot find a definition:find: function (n) { var e = this; while (true) { var o = e.def[n]; if (o) { return o; } e = e.parent; if (!e) { return symbol_table[ symbol_table.hasOwnProperty(n) ? n : "(name)"]; } } },Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Statements
- InhaltsvorschauPratt's original formulation worked with functional languages in which everything is an expression. Most mainstream languages have statements. We can easily handle statements by adding another method to tokens,
std(statement denotation). Anstdis like anudexcept that it is only used at the beginning of a statement.Thestatementfunction parses one statement. If the current token has anstdmethod, the token is reserved and thestdis invoked. Otherwise, we assume an expression statement terminated with a ;. For reliability, we reject an expression statement that is not an assignment or invocation:var statement = function ( ) { var n = token, v; if (n.std) { advance( ); scope.reserve(n); return n.std( ); } v = expression(0); if (!v.assignment && v.id !== "(") { v.error("Bad expression statement."); } advance(";"); return v; };Thestatementsfunction parses statements until it sees(end)or } signaling the end of a block. It returns a statement, an array of statements, or (if there were no statements present) simplynull:var statements = function ( ) {var a = [], s; while (true) { if (token.id === "}" || token.id === "(end)") { break; } s = statement( ); if (s) { a.push(s); } } return a.length === 0 ? null : a.length === 1 ? a[0] : a; };Thestmtfunction is used to add statements to the symbol table. It takes a statementidand anstdfunction:var stmt = function (s, f) { var x = symbol(s); x.std = f; return x; };The block statement wraps a pair of curly braces around a list of statements, giving them a new scope:stmt("{", function ( ) { new_scope( ); var a =statements( ); advance("}"); scope.pop( ); return a; });Theblockfunction parses a block:var block = function ( ) { var t = token; advance("{"); return t.std( ); };Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Functions
- InhaltsvorschauFunctions are executable object values. A function has an optional name (so that it can call itself recursively), a list of parameter names wrapped in parentheses, and a body that is a list of statements wrapped in curly braces. A function has its own scope:
prefix("function", function ( ) { var a = []; scope = new_scope( ); if (token.arity === "name") { scope.define(token); this.name = token.value; advance( ); } advance("("); if (token.id !== ")") { while (true) { if (token.arity !== "name") { token.error("Expected a parameter name."); } scope.define(token); a.push(token); advance( ); if (token.id !== ",") { break; } advance(","); } } this.first = a; advance(")"); advance("{"); this.second =statements( ); advance("}"); this.arity = "function"; scope.pop( ); return this; });Functions are invoked with the ( operator. It can take zero or more comma-separated arguments. We look at the left operand to detect expressions that cannot possibly be function values:infix("(", 90, function (left) { var a = []; this.first = left; this.second = a; this.arity = "binary"; if ((left.arity !== "unary" || left.id !== "function") && left.arity !== "name" && (left.arity !== "binary" || (left.id !== "." && left.id !== "(" && left.id !== "["))) { left.error("Expected a variable name."); } if (token.id !== ")") { while (true) { a.push(expression(0)); if (token.id !== ",") { break; } advance(","); } } advance(")"); return this; });Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Array and Object Literals
- InhaltsvorschauAn array literal is a set of square brackets around zero or more comma-separated expressions. Each of the expressions is evaluated, and the results are collected into a new array:
prefix("[", function ( ) { var a = []; if (token.id !== "]") { while (true) { a.push(expression(0)); if (token.id !== ",") { break; } advance(","); } } advance("]"); this.first = a; this.arity = "unary"; return this; });An object literal is a set of curly braces around zero or more comma-separated pairs. A pair is a key/expression pair separated by a :. The key is a literal or a name treated as a literal:prefix("{", function ( ) { var a = []; if (token.id !== "}") { while (true) { var n = token; if (n.arity !== "name" && n.arity !== "literal") { token.error("Bad key."); } advance( ); advance(":"); var v = expression(0); v.key = n.value; a.push(v); if (token.id !== ",") { break; } advance(","); } } advance("}"); this.first = a; this.arity = "unary"; return this; });Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Things to Do and Think About
- InhaltsvorschauThe simple parser shown in this chapter is easily extensible. The tree could be passed to a code generator, or it could be passed to an interpreter. Very little computation is required to produce the tree. And as we saw, very little effort was required to write the programming that built the tree.We could make the
infixfunction take an opcode that would aid in code generation. We could also have it take additional methods that would be used to do constant folding and code generation.We could add additional statements, such asfor, switch, andtry. We could add statement labels. We could add more error checking and error recovery. We could add lots more operators. We could add type specification and inference.We could make our language extensible. With the same ease that we can define new variables, we can let the programmer add new operators and new statements.You can try the demonstration of the parser that was described in this chapter at http://javascript.crockford.com/tdop/index.html.Another example of this parsing technique can be found in JSLint at http://JSLint.com.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Chapter 10: The Quest for an Accelerated Population Count
- InhaltsvorschauHenry S. Warren, Jr.A fundamental computer algorithm, and a deceptively simple one, is the population count or sideways sum, which calculates the number of bits in a computer word that are 1. The population count function has applications that range from the very simple to the quite sublime. For example, if sets are represented by bit strings, population count gives the size of the set. It can also be used to generate binomially distributed random integers. These and other applications are discussed at the end of this chapter.Although uses of this operation are not terribly common, many computers—often the supercomputers of their day—had an instruction for it. These included the Ferranti Mark I (1951), the IBM Stretch computer (1960), the CDC 6600 (1964), the Russian-built BESM-6 (1967), the Cray 1 (1976), the Sun SPARCv9 (1994), and the IBM Power 5 (2004).This chapter discusses how to compute the population count function on a machine that does not have that instruction, but which has the fundamental instructions generally found on a RISC or CISC computer: shift, add, and, load, conditional branch, and so forth. For illustration, we assume the computer has a 32-bit word size, but most of the techniques discussed here can be easily adapted to other word sizes.Two problems in population counting are addressed: counting the 1-bits in a single computer word, and counting the 1-bits in a large number of words, perhaps arranged in an array. In each case, we show that the obvious solution, even if carefully honed, can be beaten substantially by very different algorithms that take some imagination to find. The first is an application of the divide-and-conquer strategy, and the second is an application of a certain logic circuit that is familiar to computer logic designers but not so familiar to programmers.As a first cut, a programmer might count the 1-bits in a word x, as illustrated in the following C-language solution. HereEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Basic Methods
- InhaltsvorschauAs a first cut, a programmer might count the 1-bits in a word x, as illustrated in the following C-language solution. Here x is an unsigned integer, so the right shift is with 0-fill:
pop = 0; for (i = 0; i < 32; i++){ if (x & 1) pop = pop + 1; x = x >> 1; }On a typical RISC computer, the loop might compile into about seven instructions, two of which are conditional branches. (One of the conditional branches is for loop control.) These seven instructions are executed 32 times, but one of them is bypassed about half the time (we might presume), so that it executes about 32 x 6.5 = 208 instructions.It would probably not take the programmer long to realize that this code can be easily improved. For one thing, on many computers, counting down from 31 to 0 is more efficient than counting up, because it saves a compare instruction. Better yet, why count at all? Just let the loop go until x is 0. This eliminates some iterations if x has high-order 0-bits. Another optimization is to replace theiftest with code that simply adds the rightmost bit of x to the count. This leads to the code:pop = 0; while (x) { pop = pop + (x & 1); x = x >> 1; }This has only four or five RISC instructions in the loop, depending upon whether or not a compare of x to 0 is required, and only one branch. (We assume the compiler rearranges the loop so that the conditional branch is at the bottom.) Thus, it takes a maximum of 128 to 160 instructions. The maximum occurs if x begins with a 1-bit, but it will take far fewer instructions if x has many leading 0s.Some readers may recall that the simple expressionx&(x-1)is x with its least significant 1-bit turned off, or is 0 if x= 0. Thus, to count the 1-bits in x, one can turn them off one at a time until the result is 0, keeping count of how many were turned off. This leads to the code:pop = 0; while (x) { pop = pop + 1; x = x & (x - 1); }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Divide and Conquer
- InhaltsvorschauAnother interesting and useful way to compute the population count of a word is based on the "divide and conquer" paradigm. This algorithm might be devised by reasoning, "Suppose I had a way to compute the population count of a 16-bit quantity. Then I could run that on the left and right halves of the 32-bit word, and add the results, to obtain the population count of the 32-bit word." This strategy won't pay off if the basic algorithm must be run sequentially on the two halves and it takes time proportional to the number of bits being analyzed, because it would then take 16k + 16k = 32k units of time, where k is the constant of proportionality, plus another instruction for the addition. But if we can somehow do the operation on the two halfwords in parallel, there will be an improvement from, essentially, 32k to 16k + 1.To efficiently compute the population count of two 16-bit quantities, we need a way to do it for 8-bit quantities, and to do 4 of them in parallel. Continuing this reasoning, we need a way to compute the population count of 2-bit quantities, and to do 16 of them in parallel.The algorithm to be described in no way depends on running operations on separate processors, or on unusual instructions such as the SIMDinstructions found on some computers. It uses only the facilities usually found on a conventional uniprocessor RISC or CISC.The plan is illustrated in .
Figure 10-1: Counting 1-bits, "divide and conquer" strategyThe first line in is the word x for which we wish to count the number of 1-bits. Each 2-bit field of the second line contains the count of the number of 1-bits in the 2-bit field immediately above. The subscripts are the decimal values of these 2-bit fields. Each 4-bit field in the third line contains the sum of the numbers in two adjacent 2-bit fields of the second line, with the subscripts showing the decimal values, and so forth. The last line contains the number of 1-bits inEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Other Methods
- InhaltsvorschauItem 169 in the HAKMEM memois an algorithm that counts the number of 1-bits in a word x by using the first three terms of the formula shown in the previous section on each 3-bit field of x separately, to produce a word of 3-bit fields, each of which contains the number of 1-bits that were in it. It then adds adjacent 3-bit fields to form 6-bit field sums, and then adds the 6-bit fields by computing the value of the word modulo 63. Although originally developed for a machine with a 36-bit word, the algorithm is easily adapted to a 32-bit word. This is shown below in C (the long constants are in octal):
int pop(unsigned x) { unsigned n; n = (x >> 1) & 033333333333; // Count bits in x = x - n; // each three-bit n = (n >> 1) & 033333333333; // field. x = x - n; x = (x + (x >> 3)) & 030707070707; // Six-bit sums. return x%63; // Add six-bit sums. }The last line uses the unsigned modulus function. (It could be either signed or unsigned if the word length were a multiple of 3.) It's clear that the modulus function sums the 6-bit fields when you regard the word x as an integer written in base 64. Upon dividing a base b integer by b − 1, the remainder is, for b ≥ 3, congruent to the sum of the digits and, of course, is less than b − 1. Because the sum of the digits in this case must be less than or equal to 32, mod(x, 63) must be equal to the sum of the digits of x, which is to say equal to the number of 1-bits in the original x.This algorithm requires only 10 instructions on the DEC PDP-10, as that machine has an instruction for computing the remainder with its second operand directly referencing a fullword in memory. On a basic RISC, it requires about 15 instructions, assuming the machine offers unsigned modulus as one instruction (but not directly referencing a fullword immediate or memory operand). But it is probably not very fast, because division is almost always a slow operation. Also, it doesn't apply to 64-bit word lengths by simply extending the constants, although it does work for word lengths up to 62.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Sum and Difference of Population Counts of Two Words
- InhaltsvorschauTo compute pop(x) + pop(y) (if your computer does not have the population count instruction), some time can be saved by using the first two executable lines of on x and y separately, and then adding x and y and executing the last three stages of the algorithm on the sum. After the first two lines of are executed, x and y consist of eight 4-bit fields, each containing a maximum value of 4. Thus x and y may safely be added, because the maximum value in any 4-bit field of the sum would be 8, so no overflow occurs. (In fact, three words may be combined in this way.)This idea also applies to subtraction. To compute pop(x) – pop(y), use:
pop(x) – pop(y) = pop(x) – (32 – pop(y)) = pop(x) + pop(y) – 32
Then, use the technique just described to compute pop(x) + pop(y). The code is shown in . It uses 32 instructions, versus 43 for two applications of the code of followed by a subtraction.Example 10-2. Computing pop(x) – pop(y)int popDiff(unsigned x, unsigned y) { x = x - ((x >> 1) & 0x55555555); x = (x & 0x33333333) + ((x >> 2) & 0x33333333); y = ~y; y = y - ((y >> 1) & 0x55555555); y = (y & 0x33333333) + ((y >> 2) & 0x33333333); x = x + y; x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F); x = x + (x >> 8); x = x + (x >> 16); return (x & 0x0000007F) - 32; }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Comparing the Population Counts of Two Words
- InhaltsvorschauSometimes one wants to know which of two words has the larger population count, without regard to the actual counts. Can this be determined without doing a population count of the two words? Computing the difference of two population counts, as in , and comparing the result to 0, is one way, but there is another way that is preferable if either the population counts are expected to be low, or if there is a strong correlation between the particular bits that are set in the two words.The idea is to clear a single bit in each word until one of the words is all zero; the other word then has the larger population count. The process runs faster in its worst and average cases if the bits that are 1 at the same positions in each word are first cleared. The code is shown in . The procedure returns a negative number if pop(x) < pop(y), 0 if pop(x) = pop(y), and a positive number (1) if pop(x) > pop(y).Example 10-3. Comparing pop(x) with pop(y)
int popCmpr(unsigned xp, unsigned yp) { unsigned x, y; x = xp & ~yp; // Clear bits where y = yp & ~xp; // both are 1. while (1){ if (x == 0) return y | -y; if (y == 0) return 1; x = x & (x - 1); // Clear one bit y = y & (y - 1); // from each. } }After clearing the common 1-bits in each 32-bit word, the maximum possible number of 1-bits in both words together is 32. Therefore the word with the smaller number of 1-bits can have at most 16, and the loop in is executed a maximum of 16 times, which gives a worst case of 119 instructions executed on a basic RISC (16 x 7 + 7). A simulation using uniformly distributed random 32-bit numbers showed that the average population count of the word with the smaller population count is approximately 6.186, after clearing the common 1-bits. This gives an average execution time of about 50 instructions when executed on random 32-bit inputs, not as good as using . For this procedure to beat that of , the number of 1-bits in eitherEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Counting the 1-Bits in an Array
- InhaltsvorschauThe simplest way to count the number of 1-bits in an array (vector) of fullwords, in the absence of the population count instruction, is to use a procedure such as that of on each word of the array, and simply add the results. We call this the naïve method. Ignoring loop control, the generation of constants, and loads from the array, it takes 16 instructions per word: 15 for the code of , plus 1 for the addition. We assume the procedure is expanded inline, the masks are loaded outside of the loop, and the machine has a sufficient number of registers to hold all the quantities used in the calculation.Another way is to use the first two executable lines of on groups of three words in the array, adding the three partial results. Because each partial result has a maximum value of 4 in each 4-bit field, the sum of the three has a maximum value of 12 in each 4-bit field, so no overflow occurs. This idea can be applied to the 8-and 16-bit fields. Coding and compiling this method indicates that it gives about a 20 percent reduction over the naíve method in total number of instructions executed on a basic RISC. Much of the savings is canceled by the additional housekeeping instructions required. We will not dwell on this method because there is a much better way to do it.The better way seems to have been invented by Robert Harley and David Seal in about 1996.It is based on a circuit called a carry-save adder (CSA) or 3:2 compressor. A CSA is simply a sequence of independent full adders and is often used in binary multiplier circuits.In Boolean algebra notation (juxtaposition denotes and, + denotes or, and ⊕ denotes exclusive or), the logic for each full adder is:
h ← ab + ac + bc = ab + (a + b)c = ab + (a ⊕ b)c l ← (a ⊕ b) ⊕ c
where a, b, and c are the 1-bit inputs, l is the low-bit output (sum), and h is the high-bit output (carry). Changing a + b on the first line toEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Applications
- InhaltsvorschauThe population count instruction has a miscellany of uses. As mentioned at the beginning of this chapter, one use is to compute the size of a set when sets are represented by bit strings. In this representation, there is a "universe" set whose members are numbered sequentially. A set is represented by a bit string in which bit i is 1 if and only if member i is in the set.
Figure 10-3: A circuit for the total population count of seven wordsAnother simple application is to compute the Hamming distance between two bit vectors, a concept from the theory of error-correcting codes. The Hamming distance is simply the number of places where the vectors differ, that is:dist(x, y) = pop(x ⊕ y)
The population count instruction may be used to compute the number of trailing 0s in a word, using relations such as:ntz(x) = pop(¬x & (x – 1)) = 32 – pop(x | –x)
(The reader who is not familiar with these mixtures of arithmetic and logical operations might pause for a few moments to discover why they work.) The functionntz(x) also has a miscellany of uses. For example, some early computers, upon interrupt, would store a "reason for interrupt" bit in a special register. The bits were placed in a position that identified which type of interrupt occurred. The positions were chosen in priority order, usually with the higher-priority interrupts in the less significant positions. Two or more bits could be set at the same time. To determine which interrupt to process, the supervisor program would execute thentzfunction on the quantity in the special register.Another application of population count is to allow reasonably fast direct indexed access to a moderately sparse array A that is represented in a certain compact way. In the compact representation, only the defined, or nonzero, elements of the array are stored. There is an auxiliary bit stringEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Chapter 11: Secure Communication: The Technology Of Freedom
- InhaltsvorschauAshish GulhatiI speak of none other than the computer that is to come after me. A computer whose merest operational parameters I am not worthy to calculate—and yet I will design it for you. A computer which can calculate the Question to the Ultimate Answer, a computer of such infinite and subtle complexity that organic life itself shall form part of its operational matrix.Deep Thought, The Hitchhiker's Guide to the GalaxyIn mid-1999 i flew to costa rica to work with laissez faire city, a group that was working to create software systems to help usher in a new era of individual sovereignty.The group at LFC was working primarily to develop a suite of software designed to protect and enhance individual rights in the digital age, including easy-to-use secure email, online dispute mediation services, an online stock exchange, and a private asset trading and banking system. My interest in many of the same technologies had been piqued long ago by the cypherpunks list and Bruce Schneier's Applied Cryptography (Wiley), and I'd already been working on prototype implementations of some of these systems.The most fundamental of these were systems to deliver strong and usable communications privacy to just about everybody.When I stepped into LFC's sprawling "interim consulate" outside San José, Costa Rica, they had a working prototype of a secure webmail system they called MailVault. It ran on Mac OS 9, used FileMaker as its database, and was written in Frontier. Not at all the mix of technologies you'd want to run a mission-critical communications service on, but that's what the programmers had produced.It was no surprise the system crashed early and often, and was extremely fragile. It could hardly support two concurrent users. LFC was facing a credibility crisis with its investors, as their software releases had been delayed many times, and their first beta of MailVault, the flagship product, was no gem. So in the free time left over from my contract network and system administration work at LFC, I started writing a new secure mail system from scratch.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Heart of the Start
- InhaltsvorschauDeveloping Cryptonite and marketing and supporting associated services single-handedly for many years (with unwavering support and many invaluable ideas from my wife, Barkha) has been an incredibly interesting and rewarding journey, not only from a development perspective but also from an entrepreneurial one.Before jumping into the nitty-gritty of the system, I thought I'd touch upon some points that have impressed themselves strongly in my consciousness over the course of this project:
-
My friend Rishab Ghosh once quipped that there's a lot of hype about how the Internet can enable wired hackers to work from anywhere, but most of the people who create this hype live within a small area in California. The great thing about an independent startup project is that it really can be done anywhere, and dropped and picked up again when convenient. I've hacked on Cryptonite over many years on four continents, and it may well be the first high-quality software application developed in large part in the Himalayan mountains. (I used the word "wired" before loosely. In reality, five wireless technologies facilitated our connectivity in the Himalayas: a VSAT satellite Internet link, Wi-Fi, Bluetooth, GPRS, and CDMA.)
-
When working on a project as a single developer in your spare time, remember the old hacker wisdom that "six months in the lab can save you ten minutes in the library." It's critical to maximize your reuse of existing code libraries. For this reason, I elected to develop the system in Perl, a popular and flexible high-level language with a rich library of mature, free software modules, and the Perl hacker's first virtue of laziness informed every design decision.
-
Especially for end-user application software, ease of use is a critical issue. It is essential to the function of such code to present a simple, accessible interface to the user. The usability considerations of developing an end-user security application are even more significant, and were in fact a key factor in making the Cryptonite system worth developing.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Untangling the Complexity of Secure Messaging
- InhaltsvorschauWhile bringing secure communications capabilities to the world is a whoppingly great idea for the protection of individual human rights (more on this later), getting it right is a trickier task than it may seem. Public-key cryptosystems can, in principle, facilitate ad hoc secure communications, but practical implementations are very often needlessly complex and disconnected from on-the-ground realities concerning who will use such systems, and how.The fundamental problem to be solved in practical implementations based on public-key cryptography is key authentication. To send an encrypted message to someone, you need her public key. If you can be tricked into using the wrong public key, your privacy vanishes.There are two very different approaches to the key authentication problem.The conventional Public Key Infrastructure (PKI) approach, typically based on ISO standard X.509, depends on a system of trusted third-party Certification Authorities (CAs), and is in many ways fundamentally unsuited to meet the real needs of users in ad hoc networks. PKI implementations have achieved significant success in more structured domains, such as corporate VPNs and the authentication of secure web sites, but have made little headway in the real-world heterogeneous email environment.The other approach is exemplified by the most popular public-key-based messaging security solution in use today: Phil Zimmermann's PGP and its descendants, now formalized as the IETF OpenPGP protocol. OpenPGP preserves the flexibility and fundamentally decentralized nature of public-key cryptography by facilitating distributed key authentication through "webs of trust" rather than depending on a centralized, hierarchical system of CAs, as PKI approaches do (including OpenPGP's primary competitor, S/MIME). Not surprisingly, S/MIME, which is almost ubiquitously available in popular email clients, enjoys a vastly smaller user base than OpenPGP, despite email clients' general lack of comprehensive support for OpenPGP.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Usability Is the Key
- InhaltsvorschauEmail privacy software often requires users to jump through too many hoops, so very few bother to use it. Usability is critical to the success of any security solution, because if the system isn't usable, it will end up being bypassed or used in an insecure manner, in either case defeating its whole purpose.A case study of the usability of PGP conducted at Carnegie Mellon University in 1998 pointed out the specialized challenges of creating an effective and usable interface for email encryption and found that of 12 study participants, all of whom were experienced at using email, "only one-third of them were able to use PGP to correctly sign and encrypt an email message when given 90 minutes in which to do so."I saw Cryptonite as an interesting project in terms of designing a secure, reliable, and efficient email system while achieving a very high level of usability. I set out to create a web-mail system that would embed OpenPGP security into the very structure of the email experience, and help even casual users to effectively utilize OpenPGP to achieve communications privacy. The webmail format was chosen specifically because it could bring powerful communications privacy technology to anyone with access to an Internet café, or a cellphone with a web browser, not just to the few able to run desktop email encryption software on powerful computers.Cryptonite was designed to make encryption a normal part of everyday email, not by masking the complexities of the public-key cryptosystems that it relies on, but rather by making the elements of these systems clearer and more accessible to the user. Usability considerations were thus central to Cryptonite's design and development, as was manifested in a number of ways:
- Development of UI functionality from user feedback and usability studies
-
The CMU user study provided many good ideas for the initial design, and many features evolved out of usability testing with Cryptonite itself by casual email users. The interface was kept clean, minimalist, and consistent, with all important actions being at most one or two clicks away at all times.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Foundation
- InhaltsvorschauApplication software today, of course, is many levels removed from the bare hardware and builds on top of many layers of existing code. So when starting a new project, getting the foundation right has to be the crucial starting point.For a number of reasons, I chose to write Cryptonite in Perl. The rich pool of open source reusable modules on CPAN (http://www.cpan.org) helped minimize the need to write new code where existing solutions could be leveraged, and also allowed a great deal of flexibility in interfaces and options. This was borne out well by prior experience with the language as well as by later experiences with the Cryptonite project.The ability to interface to C and other libraries through Perl's XS API allowed access to even more libraries. Perl's excellent portability and robust support for object-oriented programming were other important advantages. Cryptonite was intended to be easily modifiable by licensees, which would also be facilitated by writing it in Perl.So, the Cryptonite system is implemented entirely in object-oriented Perl. The project has led to the creation of numerous open source Perl modules, which I have made available on CPAN.GNU/Linux jumped out as the obvious development platform, because code developed on a Unix-like environment would be easiest to port to whatever deployment platform it would be used on, which could only be another Unix-like platform. No Windows or Mac system at the time (OS X was in pre-beta) had what it took to run mission-critical software to be used concurrently by thousands of users. Linux was my preferred desktop environment anyway, so it was also the default choice.In 2001, development and deployment moved to OpenBSD, and since 2003, development has proceeded on OS X and OpenBSD (as well as Linux). OS X was chosen for its out-of-box usability as a portable primary desktop, combined with its Unix-like underpinnings and ability to run a wide variety of open source software. OpenBSD was chosen as a deployment platform for its reliability, superlative security record, and focus on code quality and code auditing.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Test Suite
- InhaltsvorschauBecause automated testing is a crucial component of long-term development, I developed a test suite simultaneously with the project code.The clean separation of the core from the interface makes it easy to test both components separately, as well as to quickly diagnose bugs and pinpoint where they are in the code. Writing tests for cmaild is just a matter of calling its methods with valid (or invalid) inputs and making sure that the return codes and values are as expected.The test suite for cmaild uses the client API calls
cmdopen(to open a connection to the Cryptonite Mail Daemon),cmdsend(to send an API call to the daemon), andcmdayt(to send an "Are you there?" ping to the server):use strict; use Test; BEGIN { plan tests => 392, todo => [] } useCryptonite::Mail::HTML qw (&cmdopen &cmdsend &cmdayt); $Test::Harness::Verbose = 1; my ($cmailclient, $select, $sessionkey); my ($USER, $CMAILID, $PASSWORD) = 'test'; my $a = $Cryptonite::Mail::Config::CONFIG{ADMINPW}; ok(sub { # 1: cmdopen my $status; ($status, $cmailclient, $select) = cmdopen; return $status unless $cmailclient; 1; }, 1); ok(sub { # 2: newuser my $status = cmdsend('test.pl', $a, $cmailclient, $select, 'newuser', $USER); return $status unless $status =~ /^\+OK.*with password (.*)$/; $PASSWORD = $1; 1; }, 1); ...Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Functioning Prototype
- InhaltsvorschauFor the first prototype, I used a simple object persistence module, Persistence::Object::Simple (which my friend Vipul had written for a project we'd worked on earlier) to whip up a basic user database. Using persistent objects helped keep the code clean and intuitive, and also provided a straightforward upgrade path to production database engines (simply create or derive a compatible Persistence::Object:: class for the database engine).In late 2002, Matt Sergeant created another simple prototype-to-production path for Perl hackers, DBD::SQLite module, a "self-contained RDBMS in a DBI driver," which can be used for rapid prototyping of database code without the need for a full database engine during development. Personally, though, I prefer the elegance and simplicity of persistent objects to having my code littered with SQL queries and DBI calls.Mail received into the Cryptonite system was saved to regular mbox files, which worked fine for the prototype. Of course, a production implementation would have to use a more sophisticated mail store. I decided to use PGP itself as the encryption backend, to avoid rewriting (and maintaining) all the encryption functionality already contained in PGP.GnuPG was coming along, and I kept in mind that I might want to use it for cryptography support in the future. So, I wrote Crypt::PGP5 to encapsulate the PGP5 functionality in a Perl module. This module is available from CPAN (though I haven't updated it in ages).For the cryptographic core of Crypt::PGP5, I could have used the proprietary PGPSDK library, but I would have had to create a Perl interface to it, which would likely have been more work than just using the PGP binary. So, with a healthy dose of Perlish laziness and keeping in mind that TMTOWTDI, I decided to use the Expect module to automate interactions with the PGP binary, using the same interface that's available to human users of the program. This worked well enough for the first prototype.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Clean Up, Plug In, Rock On…
- InhaltsvorschauAfter developing the initial prototype of Cryptonite in Costa Rica, I continued working on it independently. After a much needed cleanup of the code (prototype development had been hectic and had left not much time to refactor or test the code), I worked on a number of Perl modules and components that would be needed next, to make the jump from a simple prototype to a scalable product. These included Crypt::GPG (with an interface almost identical to that of Crypt::PGP5, so that switching to GnuPG for the crypto operations in Cryptonite involved little more than a single-line change to the code), and Persistence::Database::SQL and Persistence::Object::Postgres (which provide object persistence in a Postgres database, with a similar interface to Persistence::Object::Simple, making the backend database switch quite seamless as well).Persistence::Object::Postgres, like Persistence::Object::Simple, uses a blessed reference to a hash container to store key-value pairs, which can be committed to the database with a
commitmethod call. It also uses Perl's Tie mechanism to tie Postgres' large objects (BLOBs) to filehandles, enabling natural filehandle-based access to large binary objects in the data-base. One of the major benefits of Persistence::Database::SQL over Persistence::Object:: Simple, of course, is that it enables proper queries into a real database. For example, with Persistence::Object::Simple, there's no clean way to quickly search for a particular user's record, whereas with Persistence::Database::SQL, getting a specific user record from the database is straightforward:sub _getuser { # Get a user object from the database. my $self = shift; my $username = shift; $self->db->table('users'); $self->db->template($usertmpl); my ($user) = $self->db->select("WHERE USERNAME = '$username'"); return $user; }With Persistence::Object::Simple one would have to either iterate over all the persistent objects in the data directory or resort to a hack such as directly grepping the plaintext persistence files in the data directory.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Hacking in the Himalayas
- InhaltsvorschauThrough 2000 and 2001 I was able to work on Cryptonite only intermittently, both because of other commitments and because the project needed peace and quiet, which was in limited supply when I was traveling around and living in chaotic, cacophonous, polluted Indian cities.In the summer of 2002, my wife and I took a vacation in the Himalayas, where I finally managed to get the time to finish writing major chunks of the code, including adding important key management abilities to Crypt::GPG, and creating an integrated interface for key management, which is a critical part of the whole web-of-trust mechanism. The core of this management interface, the Edit Key dialog, is shown in . It enables fingerprint verification, the viewing and creation of user identity certifications, and the assigning of trust values to keys.I also ported the system over to OpenBSD, which would be the ultimate deployment platform.We already had all the other major components for a secure email service in place, and as it would still take some time to get Cryptonite ready for public use, we decided to go ahead and launch a commercial secure email service right away. This would enable me to spend more time on Cryptonite development, and to begin building a community of testers immediately.So in mid-2003, we launched the Neomailbox secure IMAP, POP3, and SMTP email service. In the following years, this proved to be an excellent move that would help fund development, freeing me from the need to take on other contract work and simultaneously keeping me in close touch with the market for secure, private messaging.In the fall of 2003, we set up a semi-permanent development base in a small Himalayan hamlet, about 2000 meters above sea level, and this is primarily where development has progressed since then. This kept our cash burn low, which is critical for a bootstrapping startup, and gave me lots of time and peace to work on Neomailbox and Cryptonite.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Invisible Hand Moves
- InhaltsvorschauBy mid-2004, Neomailbox was a year old and had attracted quite a few paying customers. Cryptonite development was put on hold for a bit while I worked on developing various aspects of the Neomailbox service as well as on a few other projects I just couldn't wait to get started on.But being out in the market was great, as it brought market forces, from competition to user feedback, to bear on the development process, and helped sharpen and clarify priorities. Customer requests and queries helped keep me intimately connected to what the users and the market wanted. Meeting the market's demands is how application code becomes beautiful in a commercial sense, after all, so interaction with the market became an integral and critical component of the development process.Cryptonite was designed to be easy to maintain and modify, precisely because I knew that at some point it would have to start to evolve in new ways, both in response to and in anticipation of what the customer wanted. Being in the market enabled me to see that emerging demand: it was clear that IMAP was the future of remote mailbox access.IMAP has a lot of attractive features that make it a very powerful and practical mail access protocol. One of the most important of these is the ability to access the same mailbox using multiple clients, which becomes increasingly important with the proliferation of computing devices. The typical user now has a desktop, a laptop, a PDA, and a cellphone, all capable of accessing her mailbox.This posed a slight problem, as I'd already implemented a full mail store for Cryptonite, and it was not IMAP-based. There were two ways forward: either implement a full IMAP server based on the Cryptonite mail store (a big job), or modify Cryptonite to enable it to use an IMAP mail store as a backend. In fact, the second would have to be done either way.Again, opting to reduce complexity of the system, and focusing on its primary purpose, I decided not to develop the Cryptonite mail store into a full-blown IMAP server. Instead, I modified it into a caching mechanism, which caches MIME skeletons (just the structure information, without the content) of multipart MIME messages listed by the user, and also entire messages read by the user, so that the next time a user opens a message she's read before, Cryptonite doesn't need to go back to the IMAP server to fetch it again.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Speed Does Matter
- InhaltsvorschauWith the Cryptonite mail store, performance had been quite snappy, and most mail store operations had been independent of mailbox size. But with the switch to IMAP, I noticed some major slowdowns with large mailboxes. A little profiling revealed that the performace bottleneck was the pure-Perl Mail::IMAPClient module, which I'd used to implement the IMAP capability.A quick benchmark script (written using the Benchmark module) helped me check whether another CPAN module, Mail::Cclient, which interfaces to the UW C-Client library, was more efficient than Mail::IMAPClient. The results showed clearly that I'd have to redo the IMAP code using Mail::Cclient:
Rate IMAPClientSearch CclientSearch IMAPClientSearch 39.8/s -- -73% CclientSearch 145/s 264% -- Rate IMAPClientSort CclientSort IMAPClientSort 21.3/s -- -99% CclientSort 2000/s 9280% --
I probably should have thought of benchmarking the different modules before writing the code with Mail::IMAPClient. I'd originally avoided the C-Client library because I wanted to keep the build process as simple as possible, and Mail::IMAPClient's build process is definitely simpler than that of Mail::Cclient.Fortunately, the switch from the former to the latter was generally quite straightforward. For some operations, I noticed that IMAPClient could do the job better than C-Client without much of a performance penalty, so Cryptonite::Mail::Service now uses both modules, each to do whatever it's better at.A program like Cryptonite is never "finished," of course, but the code is now mature, robust, full of features, and efficient enough to serve its purpose: to provide thousands of concurrent users a secure, intuitive, and responsive email experience while helping them effectively protect the privacy of their communications.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Communications Privacy for Individual Rights
- InhaltsvorschauI mentioned at the beginning of this chapter that making secure communications technology widely available is a very effective means of protecting individual rights. As this recognition is the basic motivation behind the Cryptonite project, I'd like to end with a few observations on this point.Cryptographic technology can, among other things:
-
Provide strong life-saving protection for activists, NGOs, and reporters working in repressive countries.
-
Preserve the communicability of censored news and controversial ideas.
-
Protect the anonymity of whistle-blowers, witnesses, and victims of domestic abuse.
-
For dessert, catalyze the geodesic society by enabling the free and unfettered exchange of ideas, goods, and services globally.
The motley crew of hackers known as the Cypherpunks has been developing privacy-enhancing software for years, with the intent of enhancing personal freedom and individual sovereignty in the digital age. Some cryptographic software is already a cornerstone of how the world works today. This includes the Secure SHell (SSH) remote terminal software, which is essential to securing the Internet's infrastructure, and the Secure Sockets Layer (SSL) encryption suite, which secures online commerce.But these systems target very specific needs: secure server access and secure online credit card transactions, respectively. Both are concerned with securing human-machine interactions. Much more cryptographic technology specifically targeted at human-to-human interactions needs to be deployed in the coming years to combat the advancing menace of ubiquitous surveillance (which "leads to a quick end to civilization").An easy-to-use, secure webmail system is an enabling technology—it makes possible, for the first time in history, secure and private long-distance communications between individuals all over the globe, who never need meet in person.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Hacking the Civilization
- InhaltsvorschauThis computer of such infinite and subtle complexity that it includes organic life itself in its operational matrix—the Earth, and the civilizations it hosts—can be reprogrammed in powerful ways by simple pieces of code that hack human culture right back, and rewire the operating system of society itself.Code has changed the world many times. Consider the medical advances made possible by genetic sequencing software, the impact of business software on large enterprises and small businesses alike, the revolutions enabled by industrial automation software and computer modeling, or the multiple revolutions of the Internet: email, the Web, blogs, social networking services, VoIP…. Clearly, much of the history of our times is the story of software innovation.Of course, code, like any technology, can cut both ways, either increasing or decreasing the "returns to violence" in society, increasing the efficacy of privacy-violating technology and giving tyrants more effective tools of censorship, on the one hand, or enhancing and promoting individual rights on the other. Code of either sort hacks the very core of human society itself, by altering fundamental social realities such as the tenability of free speech.Interestingly, even with a specific technology such as public key cryptography, the implementation chosen can significantly alter cultural realities. For example, a PKI-based implementation reimposes authoritarian properties such as centralized hierarchies and identification requirements on a technology whose entire value arguably lies in its lack of those properties. Despite this, PKI approaches deliver weaker key authentication than does a web-of-trust implementation (which also doesn't dilute other important features of public-key cryptography, such as distributed deployment).I think that as the weavers of code, it is to a large extent the ethical responsibility of programmers to seek not only that our code be beautiful in its design and implementation, but also that it be beautiful in its results, in a larger social context. This is why I find freedom code so beautiful. It puts computing technology to the most sublime use: protecting human rights and human life.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 12: Growing Beautiful Code in BioPerl
- InhaltsvorschauLincoln SteinIn the past decade, biology has blossomed into an information science. New technologies provide biologists with unprecedented windows into the intricate processes going on inside the cells of animals and plants. DNA sequencing machines allow the rapid readout of complete genome sequences; microarray technologies give snapshots of the complex patterns of gene expression in developing organisms; confocal microscopes produce 3-D movies to track changes in cell architecture as precancerous tissues turn malignant.These new technologies routinely generate terabytes of data, all of which must be filtered, stored, manipulated and data-mined. The application of computer science and software engineering to the problems of biological data management is called bioinformatics.Bioinformatics is similar in many ways to software engineering on Wall Street. Like software engineers in the financial sector, bioinformaticians need to be fleet of foot: they have to get applications up and running quickly with little time for requirement analysis and design. Data sets are large and mutable, with shelf lives measured in months, not years. For this reason, most bioinformatics developers favor agile development techniques, such as eXtreme Programming, and toolkits that allow for rapid prototyping and deployment. As in the financial sector, there is also a strong emphasis on data visualization and pattern recognition.One of the rapid development toolkits developed by and for bioinformaticians is BioPerl, an extensive open source library of reusable code for the Perl programming language. BioPerl provides modules to handle most common bioinformatics problems involving DNA and protein analysis, the construction and analysis of evolutionary trees, the interpretation of genetic data, and, of course, genome sequence analysis.BioPerl allows a software engineer to rapidly create complex pipelines to process, filter, analyze, integrate, and visualize large biological datasets. Because of its extensive testing by the open source community, applications built on top of BioPerl are more likely to work right the first time, and because Perl interpreters are available for all major platforms, applications written with BioPerl will run on Microsoft Windows, Mac OS X, Linux, and Unix machines.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- BioPerl and the Bio::Graphics Module
- InhaltsvorschauOne of the rapid development toolkits developed by and for bioinformaticians is BioPerl, an extensive open source library of reusable code for the Perl programming language. BioPerl provides modules to handle most common bioinformatics problems involving DNA and protein analysis, the construction and analysis of evolutionary trees, the interpretation of genetic data, and, of course, genome sequence analysis.BioPerl allows a software engineer to rapidly create complex pipelines to process, filter, analyze, integrate, and visualize large biological datasets. Because of its extensive testing by the open source community, applications built on top of BioPerl are more likely to work right the first time, and because Perl interpreters are available for all major platforms, applications written with BioPerl will run on Microsoft Windows, Mac OS X, Linux, and Unix machines.This chapter discusses Bio::Graphics, BioPerl's genome map rendering module. The problem it addresses is how to visualize a genome and its annotations. A genome consists of a set of DNA sequences, each a string of the letters [A,G,C,T], which are nucleotides, also known as base pairs, or bp. Some of the DNA sequence strings can be quite long: for example, the human genome consists of 24 DNA sequence strings, one each for chromosomes 1 through 22, plus the X and Y chromosomes. The longest of these, chromosome 1, is roughly 150,000,000 bp long (150 megabases).Hidden inside these DNA sequences are multiple regions that play roles in cell metabolism, reproduction, defense, and signaling. For example, some sections of the chromosome 1 DNA sequence are protein-coding genes. These genes are "transcribed" by the cell into shorter RNA sequences that are transported from the cell nucleus into the cytoplasm; these RNA sequences are then translated into proteins responsible for generating energy, moving nutrients into and out of the cell, making the cell membrane, and so on. Other regions of the DNA sequence are regulatory in nature: when a regulatory protein binds to a specific regulatory site, a nearby protein-coding gene is "turned on" and starts to be transcribed. Some regions correspond to parasitic DNA: short regions of sequence that can replicate themselves semiautonomously and hitchhike around on the genome. Still other regions are of unknown significance; we can tell that they're important because they have been conserved among humans and other organisms across long evolutionary intervals, but we don't yet understand what they do.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Bio::Graphics Design Process
- InhaltsvorschauI'm not enamored of formal design engineering; instead, I typically write out little snippets of pseudocode that describe how I want something to work (a "code story"), play a bit with input and output formats, do a bit of coding, and—if I'm not satisfied with how the system is fitting together—go back and rework the code story. For anything larger than a toy application, I implement little bits of the system and test them out in standalone programs before deciding whether to move forward with that part of the design. I keep my notes in a "stream of consciousness" text file, and commit code to my CVS repository often. I try to make all the code visually appealing and elegant. If it isn't elegant, something's wrong with the design, and I go back to the drawing board.My first design task was to figure out the flow of a typical Bio::Graphics application. I started with the code story shown in .Example 12-1. Basic story for BioPerl::Graphics use (pseudocode)
1 useBio::Graphics::Panel; 2 @first_set_of_features = get_features_somehow( ); 3 @second_set_of_features = get_more_features_somehow( ); 4 $panel_object = Bio::Graphics::Panel->new(panel_options...) 5 $panel_object->add_track(\@first_set_of_features, track_options...); 6 $panel_object->add_track(\@second_set_of_features, track_options...); 7 print $panel_object->png;The code story starts out by bringing in the main Bio::Graphics object class, Bio::Graphics:: Panel (line 1). This object, I reasoned, would hold configuration options for the image as a whole, such as the dimensions of the resulting diagram and its scale (base pairs per pixel), and would be the main object that users interact with.The code story continues with two calls to fetch arrays of sequence features (lines 2–3). In order to maintain independence from feature databases, I decided that Bio::Graphics would operate on lists of features that had already been retrieved from a database. Fortunately, BioPerl already supported a variety of annotation databases via a nicely generic interface. A sequence feature is represented by an interface called Bio::SeqFeatureI, which specifies a set of methods that all sequence features should support. The methods are mostly straightforward; for example,Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Extending Bio::Graphics
- InhaltsvorschauWe'll now look at some of the Bio::Graphics extensions that were added after the initial release. This illustrates how code evolves in response to user input.One of the objectives of Bio::Graphics was to support interactive browsable views of the genome using web-based applications. My basic idea for this was that a CGI script would process a fill-out form indicating the genome to browse and a region to display. The script would make the database connection, process the user's request, find the region or regions of interest, pull out the features in the corresponding region, and pass them to Bio::Graphics. Bio::Graphics would render the image, and the CGI script would incorporate this data into an
<IMG>tag for display.The one thing missing from this picture was the ability to generate an image map for the generated image. An image map is necessary to support the user's ability to click on a glyph and get more information about it. Image maps also make it possible to make tool tips appear when the user mouses over the glyph and to perform such dynamic HTML tasks as populating a pull-down menu when the user right-clicks on the glyph.To support image map generation, the original version of Bio::Graphics had a single method calledboxes(). This returned an array containing the glyph bounding rectangles, the features associated with each glyph, and the glyph objects themselves. To generate an image map, developers had to step through this array and generate the image map HTML manually.Unfortunately, this was not as easy to do as I hoped it would be, judging from the number of user-support requests I received. So, after some experience writing my own Bio:: Graphics-based genome browser, I added animage_and_map()method to Bio::Graphics:: Panel. Here is a code fragment that illustrates how to use this method:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusions and Lessons Learned
- InhaltsvorschauDesigning software to be used by other developers is a challenge. It has to be easy and straightforward to use because developers are just as impatient as everyone else, but it can't be so dumbed-down that it loses functionality. Ideally, a code library must be immediately usable by naïve developers, easily customized by more sophisticated developers, and readily extensible by experts.I think Bio::Graphics hits this sweet spot. Developers new to BioPerl can get started right away by writing simple scripts that use familiar BioPerl objects such as Bio::SeqFeature:: Generic. Intermediate developers can customize the library's output by writing callbacks, while the most sophisticated developers can extend the library with custom glyphs.Bio::Graphics also illustrates the power of standard interfaces. Because it was designed to render any object that follows BioPerl's Bio::SeqFeatureI interface, it will work hand-in-hand with any of BioPerl's sequence data access modules. Bio::Graphics can generate diagrams of handcoded sequence features as easily as it can display features read from a flat file, retrieved from a database query, or generated by a web service and transmitted across the network.The module also has a few warts, and if I had to reimplement it now, I would have done several things differently. A major issue is the way that subglyphs are generated. In the current implementation, if you assign a glyph to a feature and the feature has subfeatures, the subglyphs will all be of the same type as the top-level glyph.This has two drawbacks. First, one must use subclassing to create composite glyphs in which the subglyphs reuse code from a previously defined class and the parent glyph is something new. Second, glyph methods always have to be aware of which level of features they are currently rendering. For example, to create a glyph in which the top level is represented as a dotted octagon and the subfeatures are represented as rectangles, theEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 13: The Design of the Gene Sorte
- InhaltsvorschauJim KentThis chapter is about a moderate-sized program i wrote called the gene sorter. The size of the Gene Sorter code is larger than the projects described in most of the other chapters, about 20,000 lines in all. Though there are some smaller pieces of the Gene Sorter that are quite nice, for me the real beauty is how easy it is to read, understand, and extend the program as a whole. In this chapter, I'll present an overview of what the Gene Sorter does, highlight some of the more important parts of the code, and then discuss the issues involved in making programs longer than a thousand lines enjoyable and even beautiful to work with.The Gene Sorter helps scientists rapidly sift through the roughly 25,000 genes in the human genome to find those most relevant to their research. The program is part of the http://genome.ucsc.edu web site, which also contains many other tools for working with data generated by the Human Genome Project. The Gene Sorter design is simple and flexible. It incorporates many lessons we learned in two previous generations of programs that serve biomedical data over the Web. The program uses CGI to gather input from the user, makes queries into a MySQL database, and presents the results in HTML. About half of the program code resides in libraries shared with other genome.ucsc.edu tools.The human genome is a digital code that somehow contains all of the information needed to build a human body, including that most remarkable of organs, the human brain. The information is stored in three billion bases of DNA. Each base can be an A, C, G, or T. Thus, there are two bits of information per base, or 750 megabytes of information in the genome.It is remarkable that the information to build a human being could fit easily into a memory stick in your pocket. Even more remarkably, we know from an evolutionary analysis of many genomes that only about 10 percent of that information is actually needed. The other 90 percent of the genome consists primarily of relics from evolutionary experiments that turned into dead ends, and in the clutter left by virus-like elements known asEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The User Interface of the Gene Sorter
- InhaltsvorschauThe Gene Sorter can collect all the known genes in disease-related regions of DNA into a list of candidate genes. This list is displayed in a table, illustrated in , that includes summary information on each gene and hyperlinks to additional information. The candidate list can be filtered to eliminate genes that are obviously not relevant, such as genes expressed only in the kidneys when the viewer is researching a genetic disease of the lungs. The Gene Sorter is also useful in other contexts where one wants to look at more than one gene at once, such as when one is studying genes that are expressed in similar ways or genes that have similar known functions. The Gene Sorter is available currently for the human, mouse, rat, fruit fly, and C. elegans genomes.The controls on the top of the screen specify which version of which genome to use. The table underneath contains one row per gene.
Figure 13-1: Main page of the Gene SorterA separate configuration page controls which columns are displayed in the table and how they are displayed. A filter page can be used to filter out genes based on any combination of column values.The table can be sorted a number of ways. In this case, it is sorted by proximity to the selected gene, SYP. SYP is a gene involved with the release of neurotransmitters.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Maintaining a Dialog with the User over the Web
- InhaltsvorschauThe Gene Sorter is a CGI script. When the user points her web browser to the Gene Sorter's URL (http://genome.ucsc.edu/cgi-bin/hgNear), the web server runs the script and sends the output over the network. The script's output is an HTML form. When the user hits a button on the form, the web browser sends a URL to the web server that includes the values in the drop-down menus and other controls encoded as a series of variable=value pairs. The web server runs the script once more, passing the variable=value pairs as input. The script then generates a new HTML form in response.CGI scripts can be written in any language. The Gene Sorter script is actually a moderately large program written in C.A CGI script has both advantages and disadvantages compared to other programs that interact with users on the desktop. CGI scripts are quite portable and do not need different versions to support users on Windows, Macintosh, and Linux desktops. On the other hand, their interactivity is only modest. Unless one resorts to JavaScript (which introduces serious portability issues of its own), the display will be updated only when the user presses a button and waits a second or two for a new web page. However, for most genomic purposes, CGI provides an acceptably interactive and very standard user interface.The lifetime of a CGI script is very limited. It starts in response to the user clicking on something and finishes when it generates a web page. As a consequence, the script can't keep long-term information in program variables. For very simple CGI scripts, all of the necessary information is stored in the variable=value pairs (also known as CGI variables).However, for more complex scripts such as the Gene Sorter, this is not sufficient because the script might need to remember the result of a control that the user set several screens back, but the web server sends only the CGI variables corresponding to the controls in the most recently submitted page. Our CGI scripts therefore need a way to store data for the long term.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- A Little Polymorphism Can Go a Long Way
- InhaltsvorschauInside most programs of any flexibility, there is likely to be a polymorphic object of some sort. The table that takes up most of the main page of the Gene Sorter is composed of a series of polymorphic column objects.Making polymorphic objects in C is not as easy as it is in more modern object-oriented languages, but it can be done in a relatively straightforward manner using a
structin place of an object, and function pointers in place of polymorphic methods. shows a somewhat abbreviated version of the C code for the column object.Example 13-1. The column structure, a polymorphic object in Cstruct column /* A column in the big table. The central data structure for * hgNear. */ { /* Data set guaranteed to be in each column. */ struct column *next; /* Next column in list. */ char *name; /* Column name, not seen by user. */ char *shortLabel; /* Column label. */ char *longLabel; /* Column description. */ /* -- Methods -- */ void (*cellPrint)(struct column *col, struct genePos *gp, struct sqlConnection *conn); /* Print one cell of this column in HTML. */ void (*labelPrint)(struct column *col); /* Print the label in the label row. */ void (*filterControls)(struct column *col, struct sqlConnection *conn); /* Printout controls for advanced filter. */ struct genePos *(*advFilter)(struct column *col, struct sqlConnection *conn, /* Return list of positions for advanced filter. */ /* Lookup tables use the next few fields. */ char *table; /* Name of associated table. */ char *keyField; /* GeneId field in associated table. */ char *valField; /* Value field in associated table. */ /* Association tables use these as well as the lookup fields. */ char *queryFull; /* Query that returns 2 columns key/value. */ char *queryOne; /* Query that returns value, given key. */ char *invQueryOne; /* Query that returns key, given value. */ };Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Filtering Down to Just the Relevant Genes
- InhaltsvorschauFilters are one of the most powerful features of the gene sorter. Filters can be applied to each of the columns in order to view just the genes relevant to a particular purpose. For instance, a filter on the gene expression column can be used to find genes that are expressed in the brain but not in other tissues. A filter on the genome position can find genes on the X chromosome. A combination of these filters could find brain-specific genes found on the X chromosome. These genes would be particularly interesting to researchers on autism, since that condition appears to be to a fairly strong degree sex-linked.Each column has two filter methods:
filterControlsto write the HTML for the filter user interface andadvFilterto actually run the filter. These two methods communicate with each other through cart variables that use a naming convention that includes the program name, the lettersas, and the column name as prefixes to the specific variable name. In this way, different columns of the same type have different cart variables, and filter variables can be distinguished from other variables. A helpful routine namedcartFindPrefix, which returns a list of all variables with a given prefix, is heavily used by the filter system.The filters are arranged as a chain. Initially, the program constructs a list of all genes. Next it checks the cart to see whether any filters are set. If so, it calls the filters for each column. The first filter gets the entire gene list as input. Subsequent filters start with the output of the previous filter. The order in which the filters are applied doesn't matter.The filters are the most speed-critical code in the Gene Sorter. Most of the code is executed on just 50 or 100 genes, but the filters work on tens of thousands. To keep good interactive response time, the filter should spend less than 0.0001 of a second per gene. A modern CPU operates so fast that generally 0.0001s is not much of a limitation. However, a disk seek still takes about 0.005s, so the filter must avoid causing seeks.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Theory of Beautiful Code in the Large
- InhaltsvorschauThe Gene Sorter is one of the more beautiful programs at the design and code level that I've worked on. Most of the major parts of the system, including the cart, the directory of .ra riles, and the interface to the genomics database, are on their second or third iterations and incorporate lessons we learned from previous programs. The structure of the program's objects nicely parallels the major components of the user interface and the relational databases. The algorithms used are simple but effective, and make good trade-offs between speed, memory usage, and code complexity. The program has had very few bugs compared to most programs its size. Other people are able to come up to speed on the code base and contribute to it relatively quickly.Programming is a human activity, and perhaps the resource that limits us most when programming is our human memory. We can typically keep a half-dozen things in our short-term memory. Any more than that requires us to involve our long-term memory as well. Our long-term memory system actually has an amazingly large capacity, but we enter things into it relatively slowly, and we can't retrieve things from it randomly, only by association.While the structure of a program of no more than a few hundred lines can be dictated by algorithmic and machine considerations, the structure of larger programs must be dictated by human considerations, at least if we expect humans to work productively to maintain and extend them in the long term.Ideally, everything that you need to understand a piece of code should fit on a single screen. If not, the reader of the code will be forced at best to hop around from screen to screen in hopes of understanding the code. If the code is complex, the reader is likely to have forgotten what is defined on each screen by the time he gets back to the initial screen, and will actually have to memorize large amounts of the code before he can understand any part of it. Needless to say, this will slow down programmers, and many of them will find it frustrating as well.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Conclusion
- InhaltsvorschauThis chapter has been about one of the prettier pieces of code I've written. The program serves a useful purpose for biomedical researchers. The cart system makes it relatively easy to construct an interactive program over the Web, even though it uses a CGI interface. Both the user's model and the programmer's model of the program revolve around the idea of a big table with one row per gene and a variable number of columns that can represent many different types of data.Although the Gene Sorter is written in C, the column code is done in a straightforward, polymorphic, object-oriented design. Additional columns can be added by editing simple text files with no additional programming required, and these same files help make it easy for a single version of the program to work on different genomic databases associated with a variety of organisms.The design minimizes disk seeks, which continue to be a bottleneck, lagging CPU and memory speeds by a large margin. The code is written to be readable and reusable. I hope you will find some of the principles it is built on useful in your own programs.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 14: How Elegant Code Evolves with Hardware The Case of Gaussian Elimination
- InhaltsvorschauJack Dongarra and Piotr LuszczekThe increasing availability of advanced-architecture computers, at affordable costs, has had a significant effect on all spheres of scientific computation. In this chapter, we'll show the need for designers of computing algorithms to make expeditious and substantial adaptations to algorithms, in reaction to architecture changes, by closely examining one simple but important algorithm in mathematical software: Gaussian elimination for the solution of linear systems of equations.At the application level, science has to be captured in mathematical models, which in turn are expressed algorithmically and ultimately encoded as software. At the software level, there is a continuous tension between performance and portability on the one hand, and understandability of the underlying code. We'll examine these issues and look at trade-offs that have been made over time. Linear algebra—in particular, the solution of linear systems of equations—lies at the heart of most calculations in scientific computing. This chapter focuses on some of the recent developments in linear algebra software designed to exploit advanced-architecture computers over the decades.There are two broad classes of algorithms: those for dense matrices and those for sparse matrices. A matrix is called sparse if it contains a substantial number of zero elements. For sparse matrices, radical savings in space and execution time can be achieved through specialized storage and algorithms. To narrow our discussion and keep it simple, we will look only at the dense matrix problem (a dense matrix is defined as one with few zero elements).Much of the work in developing linear algebra software for advanced-architecture computers is motivated by the need to solve large problems on the fastest computers available. In this chapter, we'll discuss the development of standards for linear algebra software, the building blocks for software libraries, and aspects of algorithm design as influenced by the opportunities for parallel implementation. We'll explain motivations for this work, and say a bit about future directions.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Effects of Computer Architectures on Matrix Algorithms
- InhaltsvorschauThe key motivation in the design of efficient linear algebra algorithms for advanced-architecture computers involves the storage and retrieval of data. Designers wish to minimize the frequency with which data moves between different levels of the memory hierarchy. Once data is in registers or the fastest cache, all processing required for this data should be performed before it gets evicted back to the main memory. Thus, the main algorithmic approach for exploiting both vectorization and parallelism in our implementations uses block-partitioned algorithms, particularly in conjunction with highly tuned kernels for performing matrix-vector and matrix-matrix operations (the Level-2 and Level-3 BLAS). Block partitioning means that the data is divided into blocks, each of which should fit within a cache memory or a vector register file.The computer architectures considered in this chapter are:
-
Vector machines
-
RISC computers with cache hierarchies
-
Parallel systems with distributed memory
-
Multi-core computers
Vector machines were introduced in the late 1970s and early 1980s. They were able in one step to perform a single operation on a relatively large number of operands stored in vector registers. Expressing matrix algorithms as vector-vector operations was a natural fit for this type of machine. However, some of the vector designs had a limited ability to load and store the vector registers in main memory. A technique called chaining allowed this limitation to be circumvented by moving data between the registers before accessing main memory. Chaining required recasting linear algebra in terms of matrix-vector operations.RISC computers were introduced in the late 1980s and early 1990s. While their clock rates might have been comparable to those of the vector machines, the computing speed lagged behind due to their lack of vector registers. Another deficiency was their creation of a deep memory hierarchy with multiple levels of cache memory to alleviate the scarcity of band-width that was, in turn, caused mostly by a limited number of memory banks. The eventual success of this architecture is commonly attributed to the right price point and astonishing improvements in performance over time as predicted by Moore's Law. With RISC computers, the linear algebra algorithms had to be redone yet again. This time, the formulations had to expose as many matrix-matrix operations as possible, which guaranteed good cache reuse.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- A Decompositional Approach
- InhaltsvorschauAt the basis of solutions to dense linear systems lies a decompositional approach. The general idea is the following: given a problem involving a matrix A, one factors or decomposes A into a product of simpler matrices from which the problem can easily be solved. This divides the computational problem into two parts: first determine an appropriate decomposition, and then use it in solving the problem at hand.Consider the problem of solving the linear system:
Ax = bwhere A is a nonsingular matrix of order n. The decompositional approach begins with the observation that it is possible to factor A in the form:A = LUwhere L is a lower triangular matrix (a matrix that has only zeros above the diagonal) with ones on the diagonal, and U is upper triangular (with only zeros below the diagonal). During the decomposition process, diagonal elements of A (called pivots) are used to divide the elements below the diagonal. If matrix A has a zero pivot, the process will break with division-by-zero error. Also, small values of the pivots excessively amplify the numerical errors of the process. So for numerical stability, the method needs to interchange rows of the matrix or make sure pivots are as large (in absolute value) as possible. This observation leads to a row permutation matrix P and modifies the factored form to:P T A = LU
The solution can then be written in the form:x = A -1 Pb
which then suggests the following algorithm for solving the system of equations:-
Factor A
-
Solve the system Ly = Pb
-
Solve the system Ux = y
This approach to matrix computations through decomposition has proven very useful for several reasons. First, the approach separates the computation into two stages: the computation of a decomposition, followed by the use of the decomposition to solve the problem at hand. This can be important, for example, if different right hand sides are present and need to be solved at different points in the process. The matrix needs to be factored only once and reused for the different righthand sides. This is particularly important because the factorization ofEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- A Simple Version
- InhaltsvorschauFor the first version, we present a straightforward implementation of LU factorization. It consists of n–1 steps, where each step introduces more zeros below the diagonal, as shown in .
Figure 14-1: LU factorizationA tool often used to teach Gaussian elimination is MATLAB. It features a scripting language (also called MATLAB) that makes developing matrix algorithms very simple. The language might seem very unusual to people familiar with other scripting languages because it is oriented to process multidimensional arrays. The unique features of the language that we use in the example code are:-
Transposition operator for vectors and matrices: ' (single quote)
-
Matrix indexing specified as:— Simple integer values:
A(m, k)— Ranges:A(k:n, k)— Other matrices:A([k m], : ) -
Built-in matrix functions such as
size(returns matrix dimensions),tril(returns the lower triangular portion of the matrix),triu(returns the upper triangular portion of the matrix), andeye(returns an identity matrix, which contains only zero entries, except for the diagonal, which is all ones)
shows the simple implementation.Example 14-1. Simple variant (MATLAB coding)function [L,U,p] = lutx(A) %LUTX Triangular factorization, textbook version % [L,U,p] = lutx(A) produces a unit lower triangular matrix L, % an upper triangular matrix U, and a permutation vector p, % so that L*U = A(p,:) [n,n] = size(A); p = (1:n)'; for k = 1:n-1 % Find index 'm' of largest element 'r' below diagonal in k-th column [r,m] = max(abs(A(k:n,k))); m = m+k-1; % adjust 'm' so it becomes a global index % Skipelimination if column is zero if (A(m,k) ~= 0) % Swap pivot row if (m ~= k) A([k m],:) = A([m k],:); % swap rows 'k' and 'm' of 'A' p([k m]) = p([m k]); % swap entrix 'k' and 'm' of 'p' end % Compute multipliers i = k+1:n; A(i,k) = A(i,k)/A(k,k); % Update the remainder of the matrix j = k+1:n; A(i,j) = A(i,j) - A(i,k)*A(k,j); end end % Separate result L = tril(A,-1) + eye(n,n); U = triu(A);Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- LINPACK's DGEFA Subroutine
- InhaltsvorschauThe performance issues with the MATLAB version of the code continued as, in the mid-1970s, vector architectures became available for scientific computations. Vector architectures exploit pipeline processing by running mathematical operations on arrays of data in a simultaneous or pipelined fashion. Most algorithms in linear algebra can be easily vectorized. Therefore, in the late 1970s there was an effort to standardize vector operations for use in scientific computations. The idea was to define some simple, frequently used operations and implement them on various systems to achieve portability and efficiency. This package came to be known as the Level-1 Basic Linear Algebra Subprograms (BLAS) or Level-1 BLAS.The term Level-1 denotes vector-vector operations. As we will see, Level-2 (matrix-vector operations), and Level-3 (matrix-matrix operations) play important roles as well.In the 1970s, the algorithms of dense linear algebra were implemented in a systematic way by the LINPACK project. LINPACK is a collection of Fortran subroutines that analyze and solve linear equations and linear least-squares problems. The package solves linear systems whose matrices are general, banded, symmetric indefinite, symmetric positive definite, triangular, and tridiagonal square. In addition, the package computes the QR and singular value decompositions of rectangular matrices and applies them to least-squares problems.LINPACK uses column-oriented algorithms, which increase efficiency by preserving locality of reference. By column orientation, we mean that the LINPACK code always references arrays down columns, not across rows. This is important since Fortran stores arrays in column-major order. This means that as one proceeds down a column of an array, the memory references proceed sequentially through memory. Thus, if a program references an item in a particular block, the next reference is likely to be in the same block.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- LAPACK DGETRF
- InhaltsvorschauAs mentioned before, the introduction in the late 1970s and early 1980s of vector machines brought about the development of another variant of algorithms for dense linear algebra. This variant was centered on the multiplication of a matrix by a vector. These subroutines were meant to give improved performance over the dense linear algebra sub-routines in LINPACK, which were based on Level-1 BLAS. Later on, in the late 1980s and early 1990s, with the introduction of RISC-type microprocessors (the "killer micros") and other machines with cache-type memories, we saw the development of LAPACK Level-3 algorithms for dense linear algebra. A Level-3 code is typified by the main Level-3 BLAS, which, in this case, is matrix multiplication.The original goal of the LAPACK project was to make the widely used LINPACK library run efficiently on vector and shared-memory parallel processors. On these machines, LIN-PACK is inefficient because its memory access patterns disregard the multilayered memory hierarchies of the machines, thereby spending too much time moving data instead of doing useful floating-point operations. LAPACK addresses this problem by reorganizing the algorithms to use block matrix operations, such as matrix multiplication, in the innermost loops (see the paper by E. Anderson and J. Dongarra listed under "Further Reading"). These block operations can be optimized for each architecture to account for its memory hierarchy, and so provide a transportable way to achieve high efficiency on diverse modern machines.Here, we use the term "transportable" instead of "portable" because, for fastest possible performance, LAPACK requires that highly optimized block matrix operations be implemented already on each machine. In other words, the correctness of the code is portable, but high performance is not—if we limit ourselves to a single Fortran source code.LAPACK can be regarded as a successor to LINPACK in terms of functionality, although it doesn't always use the same function-calling sequences. As such a successor, LAPACK was a win for the scientific community because it could keep LINPACK's functionality while getting improved use out of new hardware.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Recursive LU
- InhaltsvorschauSetting the block size parameter for the LAPACK's LU might seem like a trivial matter at first. But in practice, it requires a lot of tuning for various precisions and matrix sizes. Many users end up leaving the setting unchanged, even if the tuning has to be done only once at installation. This problem is exacerbated by the fact that not just one but many LAPACK routines use a blocking parameter.Another issue with LAPACK's formulation of LU is the factorization of tall and narrow panels of columns performed by the DGETF2 routine. It uses Level-1 BLAS and was found to become a bottleneck as the processors became faster throughout the 1990s without corresponding increases in memory bandwidth.A solution came from a rather unlikely direction: divide-and-conquer recursion. In place of LAPACK's looping constructs, the newer recursive LU algorithm splits the work in half, factorizes the left part of the matrix, updates the rest of the matrix, and factorizes the right part. The use of Level-1 BLAS is reduced to an acceptable minimum, and most of the calls to Level-3 BLAS operate on larger portions of the matrix than LAPACK's algorithm. And, of course, the block size does not have to be tuned anymore.Recursive LU required the use of Fortran 90, which was the first Fortran standard to allow recursive subroutines. A side effect of using Fortran 90 was the increased importance of the LDA parameter, the leading dimension of A. It allows more flexible use of the subroutine, as well as performance tuning for cases when matrix dimension m would cause memory bank conflicts that could significantly reduce available memory bandwidth.The Fortran 90 compilers use the LDA parameter to avoid copying the data into a contiguous buffer when calling external routines, such as one of the BLAS. Without LDA, the compiler has to assume the worst-case scenario when input matrix a is not contiguous and needs to be copied to a temporary contiguous buffer so the call to BLAS does not end up with an out-of-bands memory access. With LDA, the compiler passes array pointers to BLAS without any copies.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- ScaLAPACK PDGETRF
- InhaltsvorschauLAPACK is designed to be highly efficient on vector processors, high-performance "super-scalar" workstations, and shared-memory multiprocessors. LAPACK can also be used sat-isfactorily on all types of scalar machines (PCs, workstations, and mainframes). However, LAPACK in its present form is less likely to give good performance on other types of parallel architectures—for example, massively parallel Single Instruction Multiple Data (SIMD) machines, or Multiple Instruction Multiple Data (MIMD) distributed-memory machines. The ScaLAPACK effort was intended to adapt LAPACK to these new architectures.By creating the ScaLAPACK software library, we extended the LAPACK library to scalable MIMD, distributed-memory, concurrent computers. For such machines, the memory hierarchy includes the off-processor memory of other processors, in addition to the hierarchy of registers, cache, and local memory on each processor.Like LAPACK, the ScaLAPACK routines are based on block-partitioned algorithms in order to minimize the frequency of data movement between different levels of the memory hierarchy. The fundamental building blocks of the ScaLAPACK library are distributed-memory versions of the Level-2 and Level-3 BLAS, and a set of Basic Linear Algebra Communication Subprograms (BLACS) for communication tasks that arise frequently in parallel linear algebra computations. In the ScaLAPACK routines, all interprocessor communication occurs within the distributed BLAS and the BLACS, so the source code of the top software layer of ScaLAPACK looks very similar to that of LAPACK.The ScaLAPACK solution to LU factorization is shown in .Example 14-5. ScaLAPACK variant (Fortran 90 coding)
SUBROUTINE PDGETRF( M, N, A, IA, JA, DESCA, IPIV, INFO ) INTEGER BLOCK_CYCLIC_2D, CSRC_, CTXT_, DLEN_, DTYPE_, $ LLD_, MB_, M_, NB_, N_, RSRC_ PARAMETER ( BLOCK_CYCLIC_2D = 1, DLEN_ = 9, DTYPE_ = 1, $ CTXT_ = 2, M_ = 3, N_ = 4, MB_ = 5, NB_ = 6, $ RSRC_ = 7, CSRC_ = 8, LLD_ = 9 ) DOUBLE PRECISION ONE PARAMETER ( ONE = 1.0D+0 ) CHARACTER COLBTOP, COLCTOP, ROWBTOP INTEGER I, ICOFF, ICTXT, IINFO, IN, IROFF, J, JB, JN, $ MN, MYCOL, MYROW, NPCOL, NPROW INTEGER IDUM1( 1 ), IDUM2( 1 ) EXTERNAL BLACS_GRIDINFO, CHK1MAT, IGAMN2D, PCHK1MAT, PB_TOPGET, $ PB_TOPSET, PDGEMM, PDGETF2, PDLASWP, PDTRSM, PXERBLA INTEGER ICEIL EXTERNAL ICEIL INTRINSIC MIN, MOD * Get grid parameters ICTXT = DESCA( CTXT_ ) CALL BLACS_GRIDINFO( ICTXT, NPROW, NPCOL, MYROW, MYCOL ) * Test the input parameters INFO = 0 IF( NPROW.EQ.-1 ) THEN INFO = -(600+CTXT_) ELSE CALL CHK1MAT( M, 1, N, 2, IA, JA, DESCA, 6, INFO ) IF( INFO.EQ.0 ) THEN IROFF = MOD( IA-1, DESCA( MB_ ) ) ICOFF = MOD( JA-1, DESCA( NB_ ) ) IF( IROFF.NE.0 ) THEN INFO = -4 ELSE IF( ICOFF.NE.0 ) THEN INFO = -5 ELSE IF( DESCA( MB_ ).NE.DESCA( NB_ ) ) THEN INFO = -(600+NB_) END IF END IF CALL PCHK1MAT( M, 1, N, 2, IA, JA, DESCA, 6, 0, IDUM1, IDUM2, INFO ) END IF IF( INFO.NE.0 ) THEN CALL PXERBLA( ICTXT, 'PDGETRF', -INFO ) RETURN END IF IF( DESCA( M_ ).EQ.1 ) THEN IPIV( 1 ) = 1 RETURN ELSE IF( M.EQ.0 .OR. N.EQ.0 ) THEN RETURN END IF * Split-ring topology for the communication along process rows CALL PB_TOPGET( ICTXT, 'Broadcast', 'Rowwise', ROWBTOP ) CALL PB_TOPGET( ICTXT, 'Broadcast', 'Columnwise', COLBTOP ) CALL PB_TOPGET( ICTXT, 'Combine', 'Columnwise', COLCTOP ) CALL PB_TOPSET( ICTXT, 'Broadcast', 'Rowwise', 'S-ring' ) CALL PB_TOPSET( ICTXT, 'Broadcast', 'Columnwise', ' ' ) CALL PB_TOPSET( ICTXT, 'Combine', 'Columnwise', ' ' ) * Handle the first block of columns separately MN = MIN( M, N ) IN = MIN( ICEIL( IA, DESCA( MB_ ) )*DESCA( MB_ ), IA+M-1 ) JN = MIN( ICEIL( JA, DESCA( NB_ ) )*DESCA( NB_ ), JA+MN-1 ) JB = JN - JA + 1 * Factor diagonal and subdiagonal blocks and test for exact * singularity. CALL PDGETF2( M, JB, A, IA, JA, DESCA, IPIV, INFO ) IF( JB+1.LE.N ) THEN * Apply interchanges to columns JN+1:JA+N-1. CALL PDLASWP('Forward', 'Rows', N-JB, A, IA, JN+1, DESCA, IA, IN, IPIV ) * Compute block row of U. CALL PDTRSM( 'Left', 'Lower', 'No transpose', 'Unit', JB, $ N-JB, ONE, A, IA, JA, DESCA, A, IA, JN+1, DESCA ) * IF( JB+1.LE.M ) THEN * Update trailing submatrix. CALL PDGEMM( 'No transpose', 'No transpose', M-JB,N-JB, JB, $ -ONE, A, IN+1, JA, DESCA, A, IA, JN+1, DESCA, $ ONE, A, IN+1, JN+1, DESCA ) END IF END IF * Loop over the remaining blocks of columns. DO 10 J = JN+1, JA+MN-1, DESCA( NB_ ) JB = MIN( MN-J+JA, DESCA( NB_ ) ) I = IA + J - JA * * Factor diagonal and subdiagonal blocks and test for exact * singularity. * CALL PDGETF2( M-J+JA, JB, A, I, J, DESCA, IPIV, IINFO ) * IF( INFO.EQ.0 .AND. IINFO.GT.0 ) INFO = IINFO + J - JA * * Apply interchanges to columns JA:J-JA. * CALL PDLASWP('Forward', 'Rowwise', J-JA, A, IA, JA, DESCA, I,I+JB-1, IPIV) IF( J-JA+JB+1.LE.N ) THEN * Apply interchanges to columns J+JB:JA+N-1. CALL PDLASWP( 'Forward', 'Rowwise', N-J-JB+JA, A, IA, J+JB, $ DESCA, I, I+JB-1, IPIV ) * Compute block row of U. CALL PDTRSM( 'Left', 'Lower', 'No transpose', 'Unit', JB, $ N-J-JB+JA, ONE, A, I, J, DESCA, A, I, J+JB, $ DESCA ) IF( J-JA+JB+1.LE.M ) THEN * Update trailing submatrix. CALL PDGEMM( 'No transpose', 'No transpose', M-J-JB+JA, $ N-J-JB+JA, JB, -ONE, A, I+JB, J, DESCA, A, $ I, J+JB, DESCA, ONE, A, I+JB, J+JB, DESCA ) END IF END IF 10 CONTINUE IF( INFO.EQ.0 ) INFO = MN + 1 CALL IGAMN2D(ICTXT, 'Rowwise', ' ', 1, 1, INFO, 1, IDUM1,IDUM2, -1,-1, MYCOL) IF( INFO.EQ.MN+1 ) INFO = 0 CALL PB_TOPSET( ICTXT, 'Broadcast', 'Rowwise', ROWBTOP ) CALL PB_TOPSET( ICTXT, 'Broadcast', 'Columnwise', COLBTOP ) CALL PB_TOPSET( ICTXT, 'Combine', 'Columnwise', COLCTOP ) RETURN ENDEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Multithreading for Multi-Core Systems
- InhaltsvorschauThe advent of multi-core chips brought about a fundamental shift in the way software is produced. Dense linear algebra is no exception. The good news is that LAPACK's LU factorization runs on a multi-core system and can even deliver a modest increase of performance if multithreaded BLAS are used. In technical terms, this is the fork-join model of computation: each call to BLAS (from a single main thread) forks a suitable number of threads, which perform the work on each core and then join the main thread of computation. The fork-join model implies a synchronization point at each join operation.The bad news is that the LAPACK's fork-join algorithm gravely impairs scalability even on small multi-core computers that do not have the memory systems available in SMP systems. The inherent scalability flaw is the heavy synchronization in the fork-join model (only a single thread is allowed to perform the significant computation that occupies the critical section of the code, leaving other threads idle) that results in lock-step execution and prevents hiding of inherently sequential portions of the code behind parallel ones. In other words, the threads are forced to perform the same operation on different data. If there is not enough data for some threads, they will have to stay idle and wait for the rest of the threads that perform useful work on their data. Clearly, another version of the LU algorithm is needed such that would allow threads to stay busy all the time by possibly making them perform different operations during some portion of the execution.The multithreaded version of the algorithm recognizes the existence of a so-called critical path in the algorithm: a portion of the code whose execution depends on previous calculations and can block the progress of the algorithm. The LAPACK's LU does not treat this critical portion of the code in any special way: the DGETF2 subroutine is called by a single thread and doesn't allow much parallelization even at the BLAS level. While one thread calls this routine, the other ones wait idly. And since the performance of DGETF2 is bound by memory bandwidth (rather than processor speed), this bottleneck will exacerbate scalability problems as systems with more cores are introduced.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- A Word About the Error Analysis and Operation Count
- InhaltsvorschauThe key aspect of all of the implementations presented in this chapter is their numerical properties.It is acceptable to forgo elegance in order to gain performance. But numerical stability is of vital importance and cannot be sacrificed because it is an inherent part of the algorithm's correctness. While these are serious considerations, there is some consolation to follow. It may be surprising to some readers that all of the algorithms presented are the same, even though it's virtually impossible to make each excerpt of code produce exactly the same output for exactly the same inputs.When it comes to repeatability of results, the vagaries of floating-point representation may be captured in a rigorous way by error bounds. One way of expressing the numerical robustness of the previous algorithms is with the following formula:
||r||/||A|| ≤ ||e|| ≤ ||A-1|| ||r||
where error e = x – y is the difference between the computed solutionyand the correct solutionx, andr= Ay – b is a so-called "residual." The previous formula basically says that the size of the error (the parallel bars surrounding a value indicate a norm—a measure of absolute size) is as small as warranted by the quality of the matrix A. Therefore, if the matrix is close to being singular in numerical sense (some entries are so small that they might as well be considered to be zero), the algorithms will not give an accurate answer. But, otherwise, a relatively good quality of result can be expected.Another feature that is common to all the versions presented is the operation count: they all perform 2/3n3 floating-point multiplications and/or additions. The order of these operations is what differentiates them. There are algorithms that increase the amount of floating-point work to save on memory traffic or network transfers (especially for distribute-memory parallel algorithms). But because the algorithms shown in this chapter have the same operation count, it is valid to compare them for performance. The computational rate (number of floating-point operations per second) may be used instead of the time taken to solve the problem, provided that the matrix size is the same. But comparing computational rates is sometimes better because it allows a comparison of algorithms when the matrix sizes differ. For example, a sequential algorithm on a single processor can be directly compared with a parallel one working on a large cluster on a much bigger matrix.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Future Directions for Research
- InhaltsvorschauIn this chapter, we have looked at the evolution of the design of a simple but important algorithm in computational science. The changes over the past 30 years have been necessary to follow the lead of the advances in computer architectures. In some cases, these changes have been simple, such as interchanging loops. In other cases, they have been as complex as the introduction of recursion and look-ahead computations. In each case, however, the code's ability to efficiently utilize the memory hierarchy is the key to high performance on a single processor as well as on shared and distributed memory systems.The essence of the problem is the dramatic increase in complexity that software developers have had to confront, and still do. Dual-core machines are already common, and the number of cores is expected to roughly double with each processor generation. But contrary to the assumptions of the old model, programmers will not be able to consider these cores independently (i.e., multi-core is not "the new SMP") because they share on-chip resources in ways that separate processors do not. This situation is made even more complicated by the other nonstandard components that future architectures are expected to deploy, including the mixing of different types of cores, hardware accelerators, and memory systems.Finally, the proliferation of widely divergent design ideas shows that the question of how to best combine all these new resources and components is largely unsettled. When combined, these changes produce a picture of a future in which programmers will have to overcome software design problems vastly more complex and challenging than those in the past in order to take advantage of the much higher degrees of concurrency and greater computing power that new architectures will offer.So, the bad news is that none of the presented code will work efficiently someday. The good news is that we have learned various ways to mold the original simple rendition of the algorithm to meet the ever-increasing challenges of hardware designs.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Further Reading
- Inhaltsvorschau
-
LINPACK User's Guide7, J. J. Dongarra, J. R. Bunch, C. B. Moler, and G. W. Stewart, SIAM: Philadelphia, 1979, ISBN 0-89871-172-X.
-
LAPACK Users' Guide, Third Edition, E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammaring, A. McKenney, and D. Sorensen, SIAM: Philadelphia, 1999, ISBN 0-89871-447-8.
-
ScaLAPACK Users' Guide, L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley, SIAM Publications, Philadelphia, 1997, ISBN 0-89871-397-8.
-
"Basic Linear Algebra Subprograms for FORTRAN usage," C. L. Lawson, R. J. Hanson, D. Kincaid, and F. T. Krogh, ACM Trans. Math. Soft., Vol. 5, pp. 308–323, 1979.
-
"An extended set of FORTRAN Basic Linear Algebra Subprograms," J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, ACM Trans. Math. Soft., Vol. 14, pp. 1–17, 1988.
-
"A set of Level 3 Basic Linear Algebra Subprograms," J. J. Dongarra, J. Du Croz, I. S. Duff, and S. Hammarling, ACM Trans. Math. Soft., Vol. 16, pp. 1–17, 1990.
-
Implementation Guide for LAPACK, E. Anderson and J. Dongarra, UT-CS-90-101, April 1990.
-
A Proposal for a Set of Parallel Basic Linear Algebra Subprograms, J. Choi, J. Dongarra, S. Ostrouchov, A. Petitet, D. Walker, and R. C. Whaley, UT-CS-95-292, May 1995.
-
LAPACK Working Note 37: Two Dimensional Basic Linear Algebra Communication Subprograms, J. Dongarra and R. A. van de Geijn, University of Tennessee Computer Science Technical Report, UT-CS-91-138, October 1991.
-
"Matrix computations with Fortran and paging," Cleve B. Moler,
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Chapter 15: The Long-Term Benefits of Beautiful Design
- InhaltsvorschauAdam KolawaSome algorithms for seemingly straightforward and simplemathematical equations are actually extremely difficult to implement. For instance, rounding problems can compromise accuracy, some mathematical equations can cause values to exceed the range of a floating-point value on the system, and some algorithms (notably the classic Fourier Transform) take much too long if done in a brute-force fashion. Furthermore, different sets of data work better with different algorithms. Consequently, beautiful code and beautiful mathematics are not necessarily one and the same.The programmers who wrote the code for the CERN mathematical library recognized the difference between mathematical equations and computed solutions: the difference between theory and practice. In this chapter, I will explore the beauty in a few of the programming strategies that they used to bridge that gap.My idea of beautiful code stems from my belief that the ultimate purpose of code is to work. In other words, code should accurately and efficiently perform the task that it was designed to complete, in such a way that there are no ambiguities as to how it will behave.I find beauty in code that I can trust—code that I am confident will produce results that are correct and applicable to my problem. What I am defining here as my first criterion of beautiful code is code that I can use and reuse without any shred of doubt in the code's ability to deliver results. In other words, my primary concern is not what the code looks like, but what I can do with it.It's not that I don't appreciate the beauty in code's implementation details; I do indeed, and I will discuss criteria and examples of code's inner beauty later in this chapter. My point here is that when code satisfies my somewhat nontraditional notion of beauty in utility, it's rarely necessary to look at its implementation details. Such code promotes what I believe to be one of the most important missions in the industry: the ability to share code with others, without requiring them to analyze the code and figure out exactly how it works. Beautiful code is like a beautiful car. You rarely need to open it up to look at its mechanics. Rather, you enjoy it from the outside and trust that it will drive you where you want to go.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- My Idea of Beautiful Code
- InhaltsvorschauMy idea of beautiful code stems from my belief that the ultimate purpose of code is to work. In other words, code should accurately and efficiently perform the task that it was designed to complete, in such a way that there are no ambiguities as to how it will behave.I find beauty in code that I can trust—code that I am confident will produce results that are correct and applicable to my problem. What I am defining here as my first criterion of beautiful code is code that I can use and reuse without any shred of doubt in the code's ability to deliver results. In other words, my primary concern is not what the code looks like, but what I can do with it.It's not that I don't appreciate the beauty in code's implementation details; I do indeed, and I will discuss criteria and examples of code's inner beauty later in this chapter. My point here is that when code satisfies my somewhat nontraditional notion of beauty in utility, it's rarely necessary to look at its implementation details. Such code promotes what I believe to be one of the most important missions in the industry: the ability to share code with others, without requiring them to analyze the code and figure out exactly how it works. Beautiful code is like a beautiful car. You rarely need to open it up to look at its mechanics. Rather, you enjoy it from the outside and trust that it will drive you where you want to go.For code to be enjoyed in this way, it must be designed so that it's clear how it should be used, easy to understand how you can apply it to solve your own problems, and easy to verify if you are using it correctly.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Introducing the CERN Library
- InhaltsvorschauI believe that the mathematical library developed by CERN (European Organization for Nuclear Research) is a prime example of beautiful code. The code in this library performs algebraic operations, integrates functions, solves differential equations, and solves a lot of physics problems. It was written over 30 years ago and has been widely used over the years. The linear algebra part of the CERN Library evolved into the LAPACK library, which is the code that I will discuss here. The LAPACK library is currently developed by multiple universities and organizations.I have used this code since I was young, and I am still fascinated by it. The beauty of this code is that it contains many complicated mathematical algorithms that are very well tested and difficult to reproduce. You can reuse it and rely on it without having to worry about whether it works.The library's high accuracy and reliability are why—even after all these years—it remains the mathematical library of choice for anyone who needs to accurately and reliably solve equations. There are other mathematical libraries available, but they can't compete with the CERN library's proven reliability and accuracy.The first part of this chapter discusses this code's outer beauty: the things that make it so accurate and reliable that developers want to reuse it and the elements that make this reuse as simple as possible. The second part explores the inner beauty of its implementation details.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Outer Beauty
- InhaltsvorschauIf you've ever tried to solve a system of linear equations or perform an equally complicated mathematical operation, you know that many times the code you write to achieve this does not deliver the correct results. One of the greatest problems with mathematical libraries is that rounding errors and floating-point operations lead to solution instabilities and incorrect results.If you design a mathematical library, you need to carefully define the range in which each algorithm will work. You need to write each algorithm in such a way that it will adhere to these conditions, and you also need to write it in such a way that the rounding errors will cancel out. This can be very complicated.In the CERN library, the algorithms are specified in a very precise way. Basically, if you look at any routine, you will notice that it has a description of what it is going to do. It really doesn't matter in which language the routine is written. In fact, these routines were written in Fortran but have interfaces that allow them to be called from almost any other place. That's also a beautiful thing. In some sense, the routine is a black box: you don't care what goes on inside, only that it delivers the appropriate results for your inputs. It carefully defines what every routine is doing, under which conditions it is working, what input data it accepts, and what constraints must be put on the input data in order to get the correct answer.For example, let's look at the LAPACK library's SGBSV routine, which solves a system of linear equations for a banded matrix. If you try to solve a system of linear equations numerically, you use different algorithms. Different algorithms operate better in different domains, and you need to know the structure of the matrix to choose the best one. For instance, you would want to use one algorithm to solve the problem if you had a banded matrix (a matrix where most of the elements are around the diagonal), and a different one if you had a sparse matrix (a matrix that has a lot of zeros and few numbers).Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Inner Beauty
- InhaltsvorschauNow, let's start looking at the code's implementation details.I believe that beautiful code is short code, and I find that lengthy, complicated code is generally quite ugly. The SGBSV routine is a prime example of the beauty of short code. It begins with a quick verification of the consistency of the input arguments, then continues with two calls that logically follow the mathematical algorithm.From the first glance, it is obvious what this code is doing: it begins by performing LU factorization with the SGBTRF routine, then solves the system with the SGBTRS routine. This code is very easy to read. There's no need to pore over hundreds of lines of code to understand what the code does. The main task is split into two subtasks, and the subtasks are pushed into a subsystem.Note that the subsystem adheres to the same design assumptions regarding memory usage as the main system. This is a very important and beautiful aspect of the design.The routines from the subsystem are reused in different "driver" routines (the SGBSV routine is called a driver routine). This creates a hierarchical system that encourages code reuse. This is beautiful, too. Code reuse significantly reduces the effort required for code development, testing, and maintenance. In fact, it is one of the best ways to increase developers' productivity and reduce their stress. The problem is that reuse is typically difficult. Very often, code is so complicated and difficult to read that developers find it easier to rewrite code from scratch than to reuse somebody else's code. Good design and clear, concise code are vital to promoting code reuse.Unfortunately, much of the code written today falls short in this respect. Nowadays, most of the code written has an inheritance structure, which is encouraged in the hope that it will bring clarity to the code. However, I must admit that I've spent hours on end staring at a few lines of such code…and still could not decipher what it was supposed to do. This is not beautiful code; it is bad code with a convoluted design. If you cannot tell what the code does by glancing at the naming conventions and several code lines, then the code is too complicated.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Conclusion
- InhaltsvorschauIn sum, I believe that beautiful code must be short, explicit, frugal, and written with consideration for reality. However, I think that the true test of beauty—for code as well as art—is whether the work stands the test of time. Lots of code has been written over the years, but little of it was still in use even several years after it was written. That the CERN library code is still in use more than 30 years after it was written confirms that it is truly beautiful.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 16: The Linux Kernel Driver Model: The Benefits of Working Together
- InhaltsvorschauGreg Kroah-HartmanThe linux kernel driver model attempts to create a system-wide tree of all different types of devices managed by the operating system. The core data structures and code used to do this have changed over the years from a very simplistic system meant for handling a few devices to a highly scalable system that can control every different type of device that the real world needs to interact with.As the Linux kernel has evolved over the years, handling more and more different types of devices, the core of the kernel has had to change and evolve in order to come up with easier and more manageable ways to handle the range of device types.Almost all devices consist of two different portions: the physical portion that defines how the operating system talks to the device (be it through the PCI bus, SCSI bus, ISA bus, USB bus, etc.) and the virtual portion that defines how the operating system presents the device to the user so that it can be operated properly (keyboard, mouse, video, sound, etc.). Through the 2.4 kernel releases, each physical portion of devices was controlled by a bus-specific portion of code. This bus code was responsible for a wide range of different tasks, and each individual bus code had no interaction with any other bus code.In 2001, Pat Mochel was working on solving the issue of power management in the Linux kernel. He came to realize that in order to shut down or power up individual devices properly, the kernel had to know the linkages between the different devices. For example, a USB disk drive should be shut down before the PCI controller card for the USB controller is shut down, in order to properly save the data on the device. To solve this issue, the kernel needed to know the tree of all devices in the system, showing what device controlled what other device, and the order in which everything was connected.Around the same time, I was running into another device-related problem: Linux did not properly handle devices in a persistent manner. I wanted my two USB printers to always have the same name, no matter which one was turned on first, or the order in which they were discovered by the Linux kernel.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Humble Beginnings
- InhaltsvorschauTo start with, a simple structure,
struct device, was created as the "base" class for all devices in the kernel. This structure started out looking like the following:struct device { struct list_head node; /* node in sibling list */ struct list_head children; struct device *parent; char name[DEVICE_NAME_SIZE]; /* descriptive ascii string */ char bus_id[BUS_ID_SIZE]; /* position on parent bus */ spinlock_t lock; /* lock for the device to ensure two different layers don't access it at the same time. */ atomic_t refcount; /* refcount to make sure the device * persists for the right amount of time */ struct driver_dir_entry * dir; struct device_driver *driver; /* which driver has allocated this device */ void *driver_data; /* data private to the driver */ void *platform_data; /* Platform-specific data (e.g. ACPI, BIOS data relevant to device */ u32 current_state; /* Current operating state. In ACPI-speak, this is D0-D3, D0 being fully functional,and D3 being off. */ unsigned char *saved_state; /* saved device state */ };Every time this structure was created and registered with the kernel driver core, a new entry in a virtual filesystem was created that showed the device and any different attributes it contained. This allowed all devices in the system to be shown to userspace, in the order in which they were connected. This virtual filesystem is now called sysfs and can be seen on a Linux machine in the /sys/devices directory. An example of this structure showing a few PCI and USB devices follows:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Reduced to Even Smaller Bits
- InhaltsvorschauAs the initial driver core rework was happening, another kernel developer, Al Viro, was working on fixing a number of issues regarding object reference counting in the virtual filesystem layer.The main problem with structures in multithreaded programs written in the C language is that it's very hard to determine when it is safe to free up any memory used by a structure. The Linux kernel is a massively multithreaded program that must properly handle hostile users as well as large numbers of processors all running at the same time. Because of this, reference counting on any structure that can be found by more than one thread is a necessity.The
struct devicestructure was one such reference-counted structure. It had a single field that was used to determine when it was safe to free the structure:atomic_t refcount;
When the structure was initialized, this field was set to 1. Whenever any code wished to use the structure, it had to first increment the reference count by calling the functionget_ device, which checked that the reference count was valid and incremented the reference count of the structure:static inline void get_device(struct device * dev) { BUG_ON(!atomic_read(&dev->refcount)); atomic_inc(&dev->refcount); }Similarly, when a thread was finished with the structure, it decremented the reference count by callingput_device, which was a bit more complex:void put_device(struct device * dev) { if (!atomic_dec_and_lock(&dev->refcount,&device_lock)) return; ... /* Tell the driver to clean up after itself. * Note that we likely didn't allocate the device, *so this is the driver's chance to free that up... */ if (dev->driver && dev->driver->remove) dev->driver->remove(dev,REMOVE_FREE_RESOURCES); }This function decremented the reference count and then, if it was the last user of the object, would tell the object to clean itself up and call a function that was previously set up to free it from the system.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Scaling Up to Thousands of Devices
- InhaltsvorschauAs Linux runs on everything from cellphones, radio-controlled helicopters, desktops, and servers to 73 percent of the world's largest supercomputers, scaling the driver model was very important and always in the backs of our minds. As development progressed, it was nice to see that the basic structures used to hold devices,
struct kobjectandstruct devices, were relatively small. The number of devices connected to most systems is directly proportional to the size of the system. So small, embedded systems had only a few—one to ten— different devices connected and in their device tree. Larger "enterprise" systems had many more devices connected, but these systems also had a lot of memory to spare, so the increased number of devices was still only a very small proportion of the kernel's overall memory usage.This comfortable scaling model, unfortunately, was found to be completely false when it came to one class of "enterprise" system, the s390 mainframe computer. This computer could run Linux in a virtual partition (up to 1,024 instances at the same time on a single machine) and had a huge number of different storage devices connected to it. Overall, the system had a lot of memory, but each virtual partition would have only a small slice of that memory. Each virtual partition wanted to see all different storage devices (20,000 could be typical), while only being allocated a few hundred megabytes of RAM.On these systems, the device tree was quickly found to suck up a huge percentage of memory that was never released back to the user processes. It was time to put the driver model on a diet, and some very smart IBM kernel developers went to work on the problem.What the developers found was initially surprising. It turned out that the mainstruct devicestructure was only around 160 bytes (for a 32-bit processor). With 20,000 devices in the system, that amounted to only 3 to 4 MB of RAM being used, a very manageable usage of memory. The big memory hog was the RAM-based filesystem mentioned earlier, sysfs, which showed all of these devices to userspace. For every device,Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Small Objects Loosely Joined
- InhaltsvorschauThe Linux driver model shows how the C language can be used to create a heavily object-oriented model of code by creating numerous small objects, all doing one thing well. These objects can be embedded in other objects, all without any runtime type identification, creating a very powerful and flexible object tree. And based on the real-world usage of these objects, their memory footprint is minimal, allowing the Linux kernel to be flexible enough to work from the same code base for tiny embedded systems up to the largest supercomputers in the world.The development of this model also shows two very interesting and powerful aspects of the way Linux kernel development works.First, the process is very iterative. As the requirements of the kernel change, and the systems on which it runs also change, the developers have found ways to abstract different parts of the model to make it more efficient where it is needed. This is responding to a basic evolutionary need of the system to survive in these environments.Second, the history of device handling shows that the process is extremely collaborative. Different developers come up independently with ideas for improving and extending different aspects of the kernel. Through the source code, others can then see the goals of the developers exactly as they describe them, and then help change the code in ways the original developers never considered. The end result is something that meets the goals of many different developers by finding a common solution that would not have been seen by any one individual.These two characteristics of development have helped Linux evolve into the most flexible and powerful operating system ever created. And they ensure that as long as this type of development continues, Linux will remain this way.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 17: Another Level of Indirection
- InhaltsvorschauDiomidis SpinellisAll problems in computer science can be solved by another level of indirection," is a famous quote attributed to Butler Lampson, the scientist who in 1972 envisioned the modern personal computer. The quote rings in my head on various occasions: when I am forced to talk to a secretary instead of the person I wish to communicate with, when I first travel east to Frankfurt in order to finally fly west to Shanghai or Bangalore, and—yes—when I examine a complex system's source code.Let's start this particular journey by considering the problem of a typical operating system that supports disparate filesystem formats. An operating system may use data residing on its native filesystem, a CD-ROM, or a USB stick. These storage devices may, in turn, employ different filesystem organizations: NTFS or ext3fs for a Windows or Linux native filesystem, ISO-9660 for the CD-ROM, and, often, the legacy FAT-32 filesystem for the USB stick. Each filesystem uses different data structures for managing free space, for storing file metadata, and for organizing files into directories. Therefore, each filesystem requires different code for each operation on a file (
open, read, write, seek, close, delete, and so on).I grew up in an era where different computers more often than not had incompatible filesystems, forcing me to transfer data from one machine to another over serial links. Therefore, the ability to read on my PC a flash card written on my camera never ceases to amaze me. Let's consider how the operating system would structure the code for accessing the different filesystems. One approach would be to employ aswitchstatement for each operation. Consider as an example a hypothetical implementation of thereadsystem call under the FreeBSD operating system. Its kernel-side interface would look as follows:int VOP_READ( struct vnode *vp, /* File to read from */ struct uio *uio, /* Buffer specification */ int ioflag, /* I/O-specific flags */ struct ucred *cred) /* User's credentials */ { /* Hypothetical implementation */ switch (vp->filesystem) { case FS_NTFS: /* NTFS-specific code */ case FS_ISO9660: /* ISO-9660-specific code */ case FS_FAT32: /* FAT-32-specific code */ /* [...] */ } }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - From Code to Pointers
- InhaltsvorschauI grew up in an era where different computers more often than not had incompatible filesystems, forcing me to transfer data from one machine to another over serial links. Therefore, the ability to read on my PC a flash card written on my camera never ceases to amaze me. Let's consider how the operating system would structure the code for accessing the different filesystems. One approach would be to employ a
switchstatement for each operation. Consider as an example a hypothetical implementation of thereadsystem call under the FreeBSD operating system. Its kernel-side interface would look as follows:int VOP_READ( struct vnode *vp, /* File to read from */ struct uio *uio, /* Buffer specification */ int ioflag, /* I/O-specific flags */ struct ucred *cred) /* User's credentials */ { /* Hypothetical implementation */ switch (vp->filesystem) { case FS_NTFS: /* NTFS-specific code */ case FS_ISO9660: /* ISO-9660-specific code */ case FS_FAT32: /* FAT-32-specific code */ /* [...] */ } }This approach would bundle together code for the various filesystems, limiting modularity. Worse, adding support for a new filesystem type would require modifying the code of each system call implementation and recompiling the kernel. Moreover, adding a processing step to all the operations of a filesystem (for example, the mapping of remote user credentials) would also require the error-prone modification of each operation with the same boilerplate code.As you might have guessed, our task at hand calls for some additional levels of indirection. Consider how the FreeBSD operating system—a code base I admire for the maturity of its engineering—solves these problems. Each filesystem defines the operations that it supports as functions and then initializes avop_vectorstructure with pointers to them. Here are some fields of theEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - From Function Arguments to Argument Pointers
- InhaltsvorschauMost Unix-related operating systems, such as FreeBSD, Linux, and Solaris, use function pointers to isolate the implementation of a filesystem from the code that accesses its contents. Interestingly, FreeBSD also employs indirection to abstract the read function's arguments.When I first encountered the call
vop->vop_read(a), shown in the previous section, I asked myself what that a argument was and what happened to the original four arguments of the hypothetical implementation of theVOP_READfunction we saw earlier. After some digging, I found that the kernel uses another level of indirection to layer filesystems on top of each other to an arbitrary depth. This layering allows a filesystem to offer some services (such as translucent views, compression, and encryption) based on the services of another underlying filesystem. Two mechanisms work cleverly together to support this feature: one allows a single bypass function to modify the arguments of anyvop_vectorfunction, while another allows all undefinedvop_vectorfunctions to be redirected to the underlying filesystem layer.You can see both mechanisms in action in . The figure illustrates three file-systems layered on top of one another. On top lies the umapfs filesystem, which the system administrator mounted in order to map user credentials. This is valuable if the system where this particular disk was created used different user IDs. For instance, the administrator might want user ID 1013 on the underlying filesystem to appear as user ID 5325.
Figure 17-2: Example of filesystem layeringBeneath the top filesystem lies the Berkeley Fast Filesystem (ffs), the time- and space-efficient filesystem used by default in typical FreeBSD installations. The ffs in turn, for most of its operations, relies on the code of the original 4.2 BSD filesystem implementationEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - From Filesystems to Filesystem Layers
- InhaltsvorschauFor a concrete example of filesystem layering, consider the case where you mount on your computer a remote filesystem using the NFS (Network File System) protocol. Unfortunately, in your case, the user and group identifiers on the remote system don't match those used on your computer. However, by interposing a umapfs filesystem over the actual NFS implementation, we can specify through external files the correct user and group mappings. illustrates how some operating system kernel function calls first get routed through the bypass function of umapfs—
umap_bypass—before continuing their journey to the corresponding NFS client functions.In contrast to thenull_bypassfunction, the implementation ofumap_bypassactually does some work before making a call to the underlying layer. Thevop_generic_argsstructure passed as its argument contains a description of the actual arguments for each vnode operation:/* * A generic structure. * This can be used by bypass routines to identify generic arguments. */ struct vop_generic_args { structvnodeop_desc *a_desc; /* other random data follows, presumably */ }; /* * This structure describes the vnode operation taking place. */ struct vnodeop_desc { char *vdesc_name; /* a readable name for debugging */ int vdesc_flags; /* VDESC_* flags */ vop_bypass_t *vdesc_call; /* Function to call */ /* * These ops are used by bypass routines to map and locate arguments. * Creds and procs are not needed in bypass routines, but sometimes * they are useful to (for example) transport layers. * Nameidata is useful because it has a cred in it. */ int *vdesc_vp_offsets; /* list ended by VDESC_NO_OFFSET */ int vdesc_vpp_offset /* return vpp location */ int vdesc_cred_offset; /* cred location, if any */ int vdesc_thread_offset /* thread location, if any * int vdesc_componentname_offset; /* if any */ };Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - From Code to a Domain-Specific Language
- InhaltsvorschauYou may have noticed that some of the code associated with the implementation of the read system call, such as the packing of its arguments into a structure or the logic for calling the appropriate function, is highly stylized and is probably repeated in similar forms for all 52 other interfaces. Another implementation detail, which we have not so far discussed and which can keep me awake at nights, concerns locking.Operating systems must ensure that various processes running concurrently don't step on each other's toes when they modify data without coordination between them. On modern multithreaded, multi-core processors, ensuring data consistency by maintaining one mutual exclusion lock for all critical operating system structures (as was the case in older operating system implementations) would result in an intolerable drain on performance. Therefore, locks are nowadays held over fine-grained objects, such as a user's credentials or a single buffer. Furthermore, because obtaining and releasing locks can be expensive operations, ideally once a lock is held it should not be released if it will be needed again in short order. These locking specifications can best be described through preconditions (what the state of a lock must be before entering a function) and postconditions (the state of the lock at a function's exit).As you can imagine, programming under those constraints and verifying the code's correctness can be hellishly complicated. Fortunately for me, another level of indirection can be used to bring some sanity into the picture. This indirection handles both the redundancy of packing code and the fragile locking requirements.In the FreeBSD kernel, the interface functions and data structures we've examined, such as
VOP_READ_AP, VOP_READ_APV, andvop_read_desc, aren't directly written in C. Instead, a domain-specific language is used to specify the types of each call's arguments and their locking pre- and postconditions. Such an implementation style always raises my pulse, because the productivity boost it gives can be enormous. Here is an excerpt from the read system call specification:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Multiplexing and Demultiplexing
- InhaltsvorschauAs you can see back in , the processing of the read system call doesn't start from
VOP_READ. VOP_READis actually called fromvn_read, which itself is called through a function pointer.This level of indirection is used for another purpose. The Unix operating system and its derivatives treat all input and output sources uniformly. Thus, instead of having separate system calls for reading from, say, a file, a socket, or a pipe, thereadsystem call can read from any of those I/O abstractions. I find this design both elegant and useful; I've often relied on it, using tools in ways their makers couldn't have anticipated. (This statement says more about the age of the tools I use than my creativity.)The indirection appearing in the middle of is the mechanism FreeBSD uses for providing this high-level I/O abstraction independence. Associated with each file descriptor is a function pointer leading to the code that will service the particular request:pipe_readfor pipes,soo_readfor sockets,mqf_readfor POSIX message queues,kqueue_readfor kernel event queues, and, finally,vn_readfor actual files.So far, in our example, we have encountered two instances where function pointers are used to dispatch a request to different functions. Typically, in such cases, a function pointer is used to demultiplex a single request to multiple potential providers. This use of indirection is so common that it forms an important element of object-oriented languages, in the form of dynamic dispatch to various subclass methods. To me, the manual implementation of dynamic dispatch in a procedural language like C is a distinguishing mark of an expert programmer. (Another is the ability to write a structured program in assembly language or Fortran.)Indirection is also often introduced as a way to factor common functionality. Have a look at the top of . Modern Unix systems have four variants of the vanillaEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Layers Forever?
- InhaltsvorschauWe could continue looking at more code examples forever, so it is worth bringing our discussion to an end by noting that Lampson attributes the aphorism that started our exploration (all problems in computer science can be solved by another level of indirection) to David Wheeler, the inventor of the subroutine. Significantly, Wheeler completed his quote with another phrase: "But that usually will create another problem." Indeed, indirection and layering add space and time overhead, and can obstruct the code's comprehensibility.The time and space overhead is often unimportant, and should rarely concern us. In most cases, the delays introduced by an extra pointer lookup or subroutine call are insignificant in the greater scheme of things. In fact, nowadays the tendency in modern programming languages is for some operations to always happen through a level of indirection in order to provide an additional measure of flexibility. Thus, for example, in Java and C#, almost all accesses to objects go through one pointer indirection, to allow for automatic garbage collection. Also, in Java, almost all calls to instance methods are dispatched through a lookup table, in order to allow inheriting classes to override a method at runtime.Despite these overheads that burden all object accesses and method calls, both platforms are doing fine in the marketplace, thank you very much. In other cases, compilers optimize away the indirection we developers put in our code. Thus, most compilers detect cases where calling a function is more expensive than substituting its code inline, and automatically perform this inlining.Then again, when we're operating at the edge of performance, indirection can be a burden. One trick that developers trying to feed gigabit network interfaces use to speed up their code is to combine functionality of different levels of the network stack, collapsing some layers of abstraction. But these are extreme cases.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 18: Python's Dictionary Implementation: Being All Things to All People
- InhaltsvorschauAndrew KuchlingDictionaries are a fundamental data type in the python programming language. Like awk's associative arrays and Perl's hashes, dictionaries store a mapping of unique keys to values. Basic operations on a dictionary include:
-
Adding a new key/value pair
-
Retrieving the value corresponding to a particular key
-
Removing existing pairs
-
Looping over the keys, values, or key/value pairs
Here's a brief example of using a dictionary at the Python interpreter prompt. (To try out this example, you can just run the python command on Mac OS and most Linux distributions. If Python isn't already installed, you can download it from http://www.python.org.)In the following interactive session, the >>> signs represent the Python interpreter's prompts, anddis the name of the dictionary I'm playing with:>>>d = {1: 'January', 2: 'February', ... 'jan': 1, 'feb': 2, 'mar': 3} {'jan': 1, 1: 'January', 2: 'February', 'mar': 3, 'feb': 2} >>> d['jan'], d[1] (1, 'January') >>> d[12] Traceback (most recent call last): File "<stdin>", line 1, in <module> KeyError: 12 >>> del d[2] >>> for k, v in d.items(): print k,v # Looping over all pairs. jan 1 1 January mar 3 >feb 2 ...
Two things to note about Python's dictionary type are:-
A single dictionary can contain keys and values of several different data types. It's legal to store the keys
1, 3+4j(a complex number), and"abc"(a string) in the same dictionary. Values retain their type; they aren't all converted to strings. -
Keys are not ordered. Methods such as
.values()that return the entire contents of a dictionary will return the data in some arbitrary arrangement, not ordered by value or by insertion time.
It's important that retrieval of keys be a very fast operation, so dictionary-like types are usually implemented as hash tables. For the C implementation of Python (henceforth referred to as CPython), dictionaries are even more pivotal because they underpin several other language features. For example, classes and class instances use a dictionary to store their attributes:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Inside the Dictionary
- InhaltsvorschauDictionaries are represented by a C structure,
PyDictObject, defined in Include/dictobject.h. Here's a schematic of the structure representing a small dictionary mapping"aa", "bb", "cc", …, "mm"to the integers 1 to 13:int ma_fill 13 int ma_used 13 int ma_mask 31 PyDictEntry ma_table[]: [0]: aa, 1 hash(aa) == -1549758592, -1549758592 & 31 = 0 [1]: ii, 9 hash(ii) == -1500461680, -1500461680 & 31 = 16 [2]: null, null [3]: null, null [4]: null, null [5]: jj, 10 hash(jj) == 653184214, 653184214 & 31 = 22 [6]: bb, 2 hash(bb) == 603887302, 603887302 & 31 = 6 [7]: null, null [8]: cc, 3 hash(cc) == -1537434360, -1537434360 & 31 = 8 [9]: null, null [10]: dd, 4 hash(dd) == 616211530, 616211530 & 31 = 10 [11]: null, null [12]: null, null [13]: null, null [14]: null, null [15]: null, null [16]: gg, 7 hash(gg) == -1512785904, -1512785904 & 31 = 16 [17]: ee, 5 hash(ee) == -1525110136, -1525110136 & 31 = 8 [18]: hh, 8 hash(hh) == 640859986, 640859986 & 31 = 18 [19]: null, null [20]: null, null [21]: kk, 11 hash(kk) == -1488137240, -1488137240 & 31 = 8 [22]: ff, 6 hash(ff) == 628535766, 628535766 & 31 = 22 [23]: null, null [24]: null, null [25]: null, null [26]: null, null [27]: null, null [28]: null, null [29]: ll, 12 hash(ll) == 665508394, 665508394 & 31 = 10 [30]: mm, 13 hash(mm) == -1475813016, -1475813016 & 31 = 8 [31]: null, null
Thema_prefix in the field names comes from the word mapping, Python's term for data types that provide key/value lookups. The fields in the structure are:ma_used-
Number of slots occupied by keys (in this case, 13).
ma_fill-
Number of slots occupied by keys or by dummy entries (also 13).
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Special Accommodations
- InhaltsvorschauWhen trying to be all things to all people—a time- and memory-efficient data type for Python users, an internal data structure used as part of the interpreter's implementation, and a readable and maintainable code base for Python's developers—it's necessary to complicate a pure, theoretically elegant implementation with special-case code for particular cases… but not too much.The
PyDictObjectalso contains space for an eight-slot hash table. Small dictionaries with five elements or fewer can be stored in this table, saving the time cost of an extramalloc()call. This also improves cache locality; for example,PyDictObjectstructures occupy 124 bytes of space when using x86 GCC and therefore can fit into two 64-byte cache lines. The dictionaries used for keyword arguments most commonly have one to three keys, so this optimization helps improve function-call performance.As previously explained, a single dictionary can contain keys of several different data types. In most Python programs, the dictionaries underlying class instances and modules have only strings as keys. It's natural to wonder whether a specialized dictionary object that only accepted strings as keys might provide benefits. Perhaps a special-case data type would be useful and make the interpreter run faster?Section 18.2.2.1: The Java implementation: another special-case optimization
In fact, there is a string-specialized dictionary type in Jython (http://www.jython.org), an implementation of Python in Java. Jython has anorg.python.org.PyStringMapclass used only for dictionaries in which all keys are strings; it is used for the_ _dict_ _dictionary underpinning class instances and modules. Jython code that creates a dictionary for user code employs a different class,org.python.core.PyDictionary, a heavyweight object that uses aEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Collisions
- InhaltsvorschauFor any hash table implementation, an important decision is what to do when two keys hash to the same slot. One approach is chaining (see http://en.wikipedia.org/wiki/Hash_table#Chaining): each slot is the head of a linked list containing all the items that hash to that slot. Python doesn't take this approach because creating linked lists would require allocating memory for each list item, and memory allocations are relatively slow operations. Following all the linked-list pointers would also probably reduce cache locality.The alternative approach is open addressing (see http://en.wikipedia.org/wiki/Hash_table#Open_addressing): if the first slot
ithat is tried doesn't contain the key, other slots are tried in a fixed pattern. The simplest pattern is called linear probing: if slotiis full, tryi+1, i+2, i+3, and so on, wrapping around to slot 0 when the end of the table is reached. Linear probing would be wasteful in Python because many programs use consecutive integers as keys, resulting in blocks of filled slots. Linear probing would frequently scan these blocks, resulting in poor performance. Instead, Python uses a more complicated pattern:/* Starting slot */ slot = hash; /* Initial perturbation value */ perturb = hash; while (<slot is full> && <item in slot doesn't equal the key>) { slot = (5*slot) + 1 + perturb; perturb >>= 5; }In the C code,5*slotis written using bit shifts and addition as(slot<<2) + slot. The perturbation factorperturbstarts out as the full hash code; its bits are then progressively shifted downward 5 bits at a time. This shift ensures that every bit in the hash code will affect the probed slot index fairly quickly. Eventually the perturbation factor becomes zero, and the pattern becomes simplyslot=(5*slot)+1. This eventually generates every integer between 0 andma_mask, so the search is guaranteed to eventually find either the key (on a search operation) or an empty slot (on an insert operation).Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Resizing
- InhaltsvorschauThe size of a dictionary's hash table needs to be adjusted as keys are added. The code aims to keep the table two-thirds full; if a dictionary is holding n keys, the table must have at least n/ (2/3) slots. This ratio is a trade-off: filling the table more densely results in more collisions when searching for a key, but uses less memory and therefore fits into cache better. Experiments have been tried where the 2/3 ratio is adjusted depending on the size of the dictionary, but they've shown poor results; every insert operation has to check whether the dictionary needed to be resized, and the complexity that the check adds to the insert operation slows things down.When a dictionary needs to be resized, how should the new size be determined? For small- or medium-size dictionaries with 50,000 keys or fewer, the new size is
ma_used*4. Most Python programs that work with large dictionaries build up the dictionary in an initial phase of processing, and then look up individual keys or loop over the entire contents. Quadrupling the dictionary size like this keeps the dictionary sparse (the fill ratio starts out at 1/4) and reduces the number of resize operations performed during the build phase. Large dictionaries with more than 50,000 keys usema_used*2to avoid consuming too much memory for empty slots.On deleting a key from a dictionary, the slot occupied by the key is changed to point to a dummy key, and thema_usedcount is updated, but the number of full slots in the table isn't checked. This means dictionaries are never resized on deletion. If you build a large dictionary and then delete many keys from it, the dictionary's hash table may be larger than if you'd constructed the smaller dictionary directly. This usage pattern is quite infrequent, though. Keys are almost never deleted from the many small dictionaries used for objects and for passing function arguments. Many Python programs will build a dictionary, work with it for a while, and then discard the whole dictionary. Therefore, very few Python programs will encounter high memory usage because of the no-resize-on-deletion policy.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Iterations and Dynamic Changes
- InhaltsvorschauA common use case is looping through the contents of a dictionary. The
keys(), values(), anditems()methods return lists containing all of the keys, values, or key/value pairs in the dictionary. To conserve memory, the user can call theiterkeys(), itervalues(), anditeritems()methods instead; they return an iterator object that returns elements one by one. But when these iterators are used, Python has to forbid any statement that adds or deletes an entry in the dictionary during the loop.This restriction turns out to be fairly easy to enforce. The iterator records the number of items in the dictionary when aniter*()method is first called. If the size changes, the iterator raises aRuntimeErrorexception with the messagedictionary changed size during iteration.One special case that modifies a dictionary while looping over it is code that assigns a new value for the same key:for k, v in d.iteritems(): d[k] = d[k] + 1
It's convenient to avoid raising aRuntimeErrorexception during such operations. Therefore, the C function that handles dictionary insertion,PyDict_SetItem(), guarantees not to resize the dictionary if it inserts a key that's already present. Thelookdict()andlookdict_ stringsearch functions support this feature by the way they report failure (not finding the searched-for key): on failure, they return a pointer to the empty slot where the searched-for key would have been stored. This makes it easy forPyDict_SetItemto store the new value in the returned slot, which is either an empty slot or a slot known to be occupied by the same key. When the new value is recorded in a slot already occupied by the same key, as ind[k]=d[k]+1, the dictionary's size isn't checked for a possible resize operation, and theRuntimeErroris avoided. Code such as the previous example therefore runs without an exception.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauDespite the many features and options presented by Python dictionaries, and their widespread use internally, the CPython implementation is still mostly straightforward. The optimizations that have been done are largely algorithmic, and their effects on collision rates and on benchmarks have been tested experimentally where possible. To learn more about the dictionary implementation, the source code is your best guide. First, read the Objects/dictnotes.txt file at http://svn.python.org/view/python/trunk/Objects/dictnotes.txt?view=markup for a discussion of the common use cases for dictionaries and of various possible optimizations. (Not all the approaches described in the file are used in the current code.) Next, read the Objects/dictobject.c source file at http://svn.python.org/view/python/trunk/Objects/dictobject.c?view=markup.You can get a good understanding of the issues by reading the comments and taking an occasional clarifying glance at the code.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Acknowledgments
- InhaltsvorschauThanks to Raymond Hettinger for his comments on this chapter. Any errors are my own.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 19: Multidimensional Iterators in NumPy
- InhaltsvorschauTravis E. OliphantNumpy is an optional package for the python language that provides a powerful N-dimensional array object. An N-dimensional array is a data structure that uses N integers, or indices, to access individual elements. It is a useful model for a wide variety of data processed by a computer.For example, a one-dimensional array can store the samples of a sound wave, a two-dimensional array can store a grayscale image, a three-dimensional array can store a color image (with one of the dimensions having a length of 3 or 4), and a four-dimensional array can store the value of pressure in a room during a concert. Even higher-dimensional arrays are often useful.NumPy provides an environment for the mathematical and structural manipulation of arrays of arbitrary dimensions. These manipulations are at the heart of much scientific, engineering, and multimedia code that routinely deals with large amounts of data. Being able to perform these mathematical and structural manipulations in a high-level language can considerably simplify the development and later reuse of these algorithms.NumPy provides an assortment of mathematical calculations that can be done on arrays, as well as providing very simple syntax for structural operations. As a result, Python (with NumPy) can be successfully used for the development of significant and fast-performing engineering and scientific code.One feature that allows fast structural manipulation is that any subregion of a NumPy array can be selected using the concept of slicing. In Python, a slice is defined by a starting index, a stopping index, and a stride, using the notation
start:stop:stride(inside square brackets).For example, suppose we want to crop and shrink a 656x498 image to a 160x120 image by selecting a region in the middle of the image. If the image is held in the NumPy array,im, this operation can be performed using:im2=im[8:-8:4, 9:-9:4]
An important feature of NumPy, however, is that the new image selected in this way actually shares data with the underlying image. A copy is not performed. This can be an important optimization when calculating with large data sets where indiscriminate copying can overwhelm the computing resources.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Key Challenges in N-Dimensional Array Operations
- InhaltsvorschauIn order to provide fast implementations of all mathematical operations, NumPy implements loops (in C) that work quickly over an array or several arrays of any number of dimensions. Writing such generic code that works quickly on arrays of arbitrary dimension can be a mind-stretching task. It may be easy to write a
forloop to process all the elements of a one-dimensional array, or two nestedforloops to process all the elements of a two-dimensional array. Indeed, if you know ahead of time how many dimensions the array consists of, you can use the right number offorloops to directly loop over all of the elements of the array. But how do you write a generalforloop that will process all of the elements of an N-dimensional array when N can be an arbitrary integer?There are two basic solutions to this problem. One solution is to use recursion by thinking about the problem in terms of a recursive case and a base case. Thus, ifcopy_ND(a, b, N)is a function that copies anN-dimensional array pointed to bybto another N-dimensional array pointed to bya, a simple recursive implementation might look like:if (N==0) copy memory from b to a return set up ptr_to_a and ptr_to_b for n=0 to size of first dimension of a and b copy_ND(ptr_to_a, ptr_to_b, N-1) add stride_a[0] to ptr_to_a and stride_b[0] to ptr_b
Notice the use of the singleforloop and the check for the base case that stops the recursion when N reaches 0.It is not always easy to think about how to write every algorithm as a recursive algorithm, even though the code just shown can often be used as a starting model. Recursion also requires the use of a function call at each iteration of the loop. So it can be all too easy for recursion to create slow code, unless some base-case optimization is performed (such as stopping when N==1 and doing the memory copy in aforloop locally).Most languages will not do that kind of optimization automatically, so an elegant-looking recursive solution might end up looking much more contrived by the time optimizations are added.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Memory Models for an N-Dimensional Array
- InhaltsvorschauThe simplest model for an N-dimensional array in computer memory can be used whenever all of the elements of the array are sitting next to each other in a contiguous segment. Under such circumstances, getting to the next element of the array is as simple as adding a fixed constant to a pointer to the memory location of the current data pointer. As a result, an iterator for contiguous memory arrays requires just adding a fixed constant to the current data pointer. Therefore, if every N-dimensional array in NumPy were contiguous, discussing iterators would be rather uninteresting.The beauty of the iterator abstraction is that it allows us to think about processing and manipulating noncontiguous arrays with the same ease as contiguous arrays. Noncontiguous arrays arise in NumPy because an array can be created that is a "view" of some other contiguous memory area. This new array may not itself be contiguous.For example, consider a three-dimensional array,
a, that is contiguous in memory. With NumPy, you can create another array consisting of a subset of this larger array using Python's slicing notation. Thus, the statementb=a[::2, 3:, 1::3]returns another NumPy array consisting of every other element in the first dimension, all elements starting at the fourth element (with zero-based indexing) in the second dimension, and every third element starting at the second element in the third dimension. This new array is not a copy of the memory at those locations; it is a view of the original array and shares memory with it. But this new array cannot be represented as a contiguous chunk of memory.A two-dimensional illustration should further drive home the point. shows a contiguous, two-dimensional, 4 x 5 array with memory locations labeled from 1 through 20. Above the representation of the 4 x 5 array is a linear representation of the memory for the array as the computer might see it. IfEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - NumPy Iterator Origins
- InhaltsvorschauLoops that traverse all the elements of an array in compiled code are an essential feature that NumPy offers the Python programmer. The use of an iterator makes it relatively easy to write these loops in a straightforward and readable way that works for the most general (arbitrarily strided) arrays supported by NumPy. The iterator abstraction is an example in my mind of beautiful code because it allows simple expression of a simple idea, even though the underlying implementation details might actually be complicated. This kind of beautiful code does not just drop into existence, but it is often the result of repeated attempts to solve a set of similar problems until a general solution crystallizes.My first attempt at writing a general-purpose, N-dimensional looping construct occurred around 1997 when I was trying to write code for both an N-dimensional convolution and a general-purpose arraymap that would perform a Python function on every element of an N-dimensional array.The solution I arrived at then (though not formalized as an iterator) was to keep track of a C array of integers as indices for the N-dimensional array. Iteration meant incrementing this N-index counter with special code to wrap the counter back to zero and increment the next counter by 1 when the index reached the size of the array in a particular dimension.While writing NumPy eight years later, I became more aware of the concept of, and use of, iterator objects in Python. I thus considered adapting the arraymap code as a formal iterator. In the process, I studied how Peter Vevreer (author of SciPy's ndimage package) accomplished N-dimensional looping and discovered an iterator very similar to what I had already been using. With this boost in confidence, I formalized the iterator, applying ideas from ndimage to the basic structural elements contained in the arraymap and N-dimensional convolution code.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Iterator Design
- InhaltsvorschauAs described previously, an iterator is an abstract concept that encapsulates the idea of walking through each element of an array. The basic pseudocode for an iterator-based loop used in NumPy is:
set up iterator (including pointing the current value to the first value in the array) while iterator not done:process the current value point the current value to the next valueEverything but process the current value must be handled by the iterator and deserves discussion. As a result, there are basically three parts to the iterator design:-
Moving to the next value
-
Termination
-
Setup
These will each be discussed separately. The design considerations that went into NumPy's iterators included making the overhead for using them inside of a loop as small as possible and making them as fast as possible.The first decision is the order in which the elements will be taken. Although one could conceive of an iterator with no guarantee of the order in which the elements are taken, it is useful most of the time for the programmer to know the order. As a result, iterators in NumPy follow a specific order. The order is obtained using a relatively simple approach patterned after simple counting (with wrap-around) using a tuple of digits. Let a tuple of N integers represent the current position in the array, with (0,…,0) representing the first element of the n1 x n2 x … x nN array, and (n1−1, n2−1,…, nN−1) representing the last element.This tuple of integers represents an N-digit counter. The next position is found by incrementing the last digit by one. If, during this process, the ith digit reaches ni, it is set to 0, and the (i−1)th digit is incremented by 1.For example, the counting for a 3 x 2 x 4 array would proceed as follows:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Iterator Interface
- InhaltsvorschauThe iterator is implemented in NumPy using a combination of macros and function calls. An iterator is created using the C-API function call
it=PyArray_IterNew(ao). The check for iterator termination can be accomplished using the macroPyArray_ITER_NOTDONE(it). Finally, the next position in the iterator is accomplished usingPyArray_ITER_NEXT(it), which is a macro to ensure that it occurs inline (avoiding the function call). Ideally, this macro would be an inline function because it is sufficiently complicated. However, because NumPy is written to ANSI C, which does not define inline functions, a macro is used. Finally, the pointer to the first byte of the current value can be obtained usingPyArray_ITER_DATA(it), which avoids referencing the structure memberdataptrdirectly (allowing for future name changes to the structure members).An example of the iterator interface is the following code snippet, which computes the largest value in an N-dimensional array. We assume that the array is namedao, has elements of typedouble, and is correctly aligned:#include <float.h> double *currval, maxval=-DBL_MAX; PyObject *it; it = PyArray_IterNew(ao); while (PyArray_ITER_NOTDONE(it)) { currval = (double *)PyArray_ITER_DATA(it); if (*currval < maxval) maxval = *currval; PyArray_ITER_NEXT(it); }This code shows how relatively easy it is to construct a loop for a noncontiguous, N-dimensional array using the iterator structure. The simplicity of this code also illustrates the elegance of iterator abstraction. Notice how similar the code is to the simple iterator pseudocode shown at the beginning of the earlier section "Iterator Design." Consider also that this code works for arrays of arbitrary dimensions and arbitrary strides in each dimension, and you begin to appreciate the beauty of the multidimensional iterator.The iterator-based code is fast for both contiguous and noncontiguous arrays. However, the fastest contiguous-array loop is still something like:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Iterator Use
- InhaltsvorschauA good abstraction proves its worth when it makes coding simpler under diverse conditions, or when it ends up being useful in ways that were originally unintended. Both of these affirmations of value are definitely true of the NumPy iterator object. With only slight modifications, the original simple NumPy iterator has become a workhorse for implementing other NumPy features, such as iteration over all but one dimension and iteration over multiple arrays at the same time. In addition, when we had to quickly addsome enhancements to the code for generating random numbers and for broadcast-based copying, the existence of the iterator and its extensions made implementation much easier.A common motif in NumPy is to gain speed by concentrating optimizations on the loop over a single dimension where simple striding can be assumed. Then an iteration strategy that iterates over all but the last dimension is used. This was the approach introduced by NumPy's predecessor, Numeric, to implement the math functionality.In NumPy, a slight modification to the NumPy iterator makes it possible to use this basic strategy in any code. The modified iterator is returned from the constructor as follows:
it = PyArray_IterAllButAxis(array, &dim).
ThePyArray_IterAllButAxisfunction takes a NumPy array and the address of an integer representing the dimension to remove from the iteration. The integer is passed by reference (the&operator) because if the dimension is specified as −1, the function determines which dimension to remove from iteration and places the number of that dimension in the argument. When the input dimension is −1, the routine chooses the dimension with the smallest nonzero stride.Another choice for the dimension to remove might have been the dimension with the largest number of elements. That choice would minimize the number of outer loop iterations and reserve the most elements for the presumably fast inner loop. The problem with that choice is that getting information in and out of memory is often the slowest part of an algorithm on general-purpose processors.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauThe iterator object in NumPy is an example of a coding abstraction that simplifies programming. Since its construction in 2005, it has been extremely useful in writing N-dimensional algorithms that work on general NumPy arrays regardless of whether or not they are contiguous in memory or actually represent noncontiguous N-dimensional slices of some other contiguous chunk of memory. In addition, simple modifications to the iterator have made it much simpler to implement some of the more difficult ideas of NumPy, such as optimizing loops (looping over all but the least-striding dimension) and broadcasting.Iterators are a beautiful abstraction because they save valuable programmer attention in the implementation of a complicated algorithm. This has been true in NumPy as well. The NumPy implementation of a standard array iterator has made general-purpose code much more pleasant to write and debug, and it has allowed the encapsulation and exposure of some of the important (but hard-to-write) internal features of NumPy, such as broadcasting.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 20: A Highly Reliable Enterprise System for NASA's Mars Rover Mission
- InhaltsvorschauRonald MakHow often do you hear that beauty is in the eye of the beholder? In our case, the beholder was NASA's Mars Exploration Rover mission, and it had very strict requirements that the mission's software systems be functional, reliable, and robust. Oh, and the software also had to be completed on schedule—Mars would not accept any excuses for schedule slips. When NASA talks about meeting "launch windows," it means it in more ways than one!This chapter describes the design and development of Collaborative Information Portal, or CIP, which is a large enterprise information system developed at NASA and used by mission managers, engineers, and scientists worldwide.Martians have zero tolerance for ugly software. For CIP, the notion of beauty is not so much about elegant algorithms or programs that you can stand back and admire. Rather, beauty is embodied in a complex software structure built by master builders who knew just where to pound in the nails. Large applications can be beautiful in ways that small programs often are not. This is due both to increased necessity and to greater opportunity—large applications often have to do things that small programs don't need to. We'll take a look at CIP's overall Java-based service-oriented architecture, and then, by focusing on one of its services as a case study, examine some code snippets and study some of the nails that enable the system to meet the functionality, reliability, and robustness requirements.As you can imagine, software used on NASA space missions must be highly reliable. Missions are expensive, and years of planning and many millions of dollars cannot be jeopardized by faulty programs. The most difficult part of the software work, of course, is to debug and patch software used onboard a spacecraft that is millions of miles from Earth. But even ground-based systems must be reliable; nobody wants a software bug to interrupt mission operations or cause the loss of valuable data.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Mission and the Collaborative Information Portal
- InhaltsvorschauThe primary goal of the Mars Exploration Rover, or MER, mission is to discover whether liquid water once flowed on the Martian surface. In June and July 2003, NASA launched two identical rovers to Mars to operate as robotic geologists. In January 2004, after separate seven-month journeys, they landed on opposite sides of the planet.Each rover is solar-powered and can drive itself over the surface. Each one has scientific instruments such as spectrometers mounted at the end of an articulated arm. The arm has a drill and a microscopic imager to examine what's beneath the surface of rocks. Each rover has several cameras and antennas to send data and images back to earth (see ).Unmanned NASA missions are a combination of hardware and software. Different software packages onboard each Mars rover control its operation autonomously and in response to remote commands issued from mission control at NASA's Jet Propulsion Laboratory (JPL) near Pasadena, California. Earth-based software packages at mission control enable the mission managers, engineers, and scientists to download and analyze the information sent by the rovers, to plan and develop new command sequences to send to the rovers, and to collaborate with each other.At the NASA Ames Research Center near Mountain View, California, we designed and developed the Collaborative Information Portal (CIP) for the MER mission. The project team consisted of 10 software engineers and computer scientists. Another nine team members included project managers and the support engineers who took care of QA, software system build, hardware configuration, and bug-tracking tasks.
Figure 20-1: A Mars rover (image courtesy of JPL)Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Mission Needs
- InhaltsvorschauWe designed CIP to meet three primary needs of the MER mission. By satisfying these needs, CIP provides vital "situational awareness" among mission personnel:
- Time management
-
Keeping everybody synchronized during any large complex mission is critical for success, and MER presented some special time management challenges. Because mission personnel work at locations around the world, CIP displays time in various terrestrial time zones. Because the rovers landed on opposite sides of Mars, there are also two Martian time zones.Initially, the mission ran on Mars time, which meant that all scheduled meetings and events (such as data downloads from Mars) were given in the time of one Mars time zone or the other, depending on to which rover the meeting or event pertained. A Martian day is nearly 40 minutes longer than an Earth day, and so relative to Earth time, mission personnel shifted later by that amount of time each day as seen by their families and colleagues who remained on Earth time. This made CIP's time management functions even more important.
- Personnel management
-
With two rover teams (one per rover) and some people moving between the teams, keeping track of everybody is another critical function. CIP manages a personnel roster and displays schedules (as Gantt charts) that show who is working where, when, and in what role.CIP also enables some collaboration among mission personnel. They can send out broadcast messages, share analyses of data and image, upload reports, and annotate each other's reports.
- Data management
-
Getting data and images from the far reaches of the universe is the key to every NASA planetary and deep space mission, and here, too, CIP plays a major role. An array of terrestrial antennas receive the data and images sent by the Mars rovers, and once on Earth, they are transmitted to JPL to be processed and stored in the mission file servers.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - System Architecture
- InhaltsvorschauCode beauty for an enterprise system is derived partly from architecture, the way the code is put together. Architecture is more than aesthetics. In a large application, architecture determines how the software components interoperate and contributes to overall system reliability.We implemented CIP using a three-tiered service-oriented architecture (SOA). We adhered to industry standards and best practices, and where practicable, we used commercial off-the-shelf (COTS) software. We programmed mostly in Java and used the Java 2 Enterprise Edition (J2EE) standards (see ).The client tier consists mostly of standalone GUI-based Java applications implemented with Swing components and a few web-based applications. A J2EE-compliant application server runs in the middleware and hosts all the services that respond to requests from the client applications. We implemented the services using Enterprise JavaBeans (EJBs). The data tier consists of the data sources and data utility programs. Also written in Java, these utilities monitor the file server for the processed data and image files. The utilities generate metadata for the files as soon as they're released by the mission managers.
Figure 20-2: CIP's three-tiered service-oriented architectureUsing an SOA based on J2EE gave us the option to use these well-defined beans (and others) wherever appropriate in the design of a large enterprise application. Stateless session beans handle service requests without remembering any state from one request to the next. On the other hand, stateful session beans maintain state information for clients and typically manage persisted information that the beans read from and write to a data store. Having multiple options in any design situation is important when developing large complex applications.In the middleware, we implemented a stateless session bean as a service provider for each service. This was the facade from which we generated a web service that the client applications use to make service requests and get back responses. Each service may also access one or more stateful session beans that are business objects supplying any necessary logic that must maintain state between service requests, such as where to read the next block of information from a database in response to a data request. In effect, the stateless beans are often the service dispatchers for the stateful beans that do the actual work.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Case Study: The Streamer Service
- InhaltsvorschauYou've seen some of the beauty of CIP at the architectural level. It's time to focus on one of its middleware services—the streamer service—as a case study, and examine some of the nails that allowed us to meet the mission's strict functional, reliability, and robustness requirements. You'll see that the nails were not particularly fancy; the beauty was in knowing just where to pound them in.One of the MER mission's data management needs is to allow users to download data and image files from the mission file servers located at JPL to their personal workstations and laptops. As described earlier, CIP data-tier utilities generate metadata that allow users to find the files they want based on various search criteria. Users also need to upload files that contain their analysis reports to the servers.CIP's streamer service performs file downloading and uploading. We gave the service that name because it streams the file data securely across the Internet between the mission file servers at JPL and users' local computers. It uses the web services protocol, so client applications can be written in any language that supports the protocol, and these applications are free to devise whatever GUI they deem suitable.Like each of the other middleware services, the streamer service uses web services to receive client requests and to return responses. Each request is first fielded by the streamer service provider, which is implemented by a stateless session bean. The service provider creates a file reader, implemented by a stateful session bean, to do the actual work of downloading the requested file contents to the client. Conversely, the service provider creates a file writer, also implemented by a stateful session bean, to upload file contents (see ).
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Reliability
- InhaltsvorschauReliable code continues to perform well without problems. It rarely crashes, if ever. As you can imagine, code that is on board the Mars rovers must be extremely reliable because making on-site maintenance calls is somewhat difficult. But the MER mission wanted earthbound software used by mission control to be reliable, too. Once the mission was underway, no one wanted software problems to disrupt operations.As noted earlier, the CIP project took several measures to ensure the intrinsic reliability of the system:
-
Adhering to industry standards and best practices, including J2EE
-
Using proven COTS software wherever practicable, including a commercial application server from an established middleware vendor
-
Using a service-oriented architecture with modular services
-
Implementing simple, straightforward middleware services
We further enhanced reliability with extra nails: service logging and monitoring. While these features can be useful for debugging even small programs, they become essential for keeping track of the runtime behavior of large applications.During development, we used the open source Apache Log4J Java package to log nearly everything that occurred in the middleware services. It was certainly useful for debugging during development. Logging enabled us to write code that was more reliable. Whenever there was a bug, the logs told us what was going on just prior to the problem, and so we were better able to fix the bug.We originally intended to reduce the logging only to serious messages before CIP went into operation. But we ended up leaving most of the logging on since it had a negligible impact on overall performance. Then we discovered that the logs gave us much useful information not only about what was going on with each service, but alsoEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Robustness
- InhaltsvorschauChange is inevitable, and beautiful code can handle change gracefully even after going into operation. We took a couple of measures to ensure that CIP is robust and can deal with changes in operational parameters:
-
We avoided hardcoding parameters in the middleware services.
-
We made it possible to make changes to the middleware services that are already in operation with minimal interruption to the client applications.
Most of the middleware services have certain key operational parameters. For example, as seen above, the streamer service downloads file contents in blocks, and so it has a block size. Instead of hardcoding the block size, we put the value in a parameter file that the service reads each time it first starts up. This happens whenever the streamer service provider bean is loaded (lines 3–5 in the code under the section "Logging").A middleware.properties file, which all the middleware services share and load, contains the line:middleware.streamer.blocksize = 65536
The file reader bean'sreadDataBlock( )method can then refer to the value (line 183).Each middleware service can load several parameter values at startup. One of the skills of a master software builder is knowing which key values of a service to expose as loadable parameters. They are certainly helpful during development; for instance, we were able to try different block sizes during development without having to recompile the streamer service each time.But loadable parameters are even more critical for putting code into operation. In most production environments, it is difficult and expensive to make changes to software that is already in operation. This was certainly true for the MER mission, which had a formal Change Control Board that scrutinized the justifications for making code changes once the mission was under way.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Conclusion
- InhaltsvorschauThe Collaborative Information Portal proves that it is possible—yes, even at a huge government agency like NASA—to develop a large complex enterprise software system on time that successfully meets strict requirements for functionality, reliability, and robustness. The Mars rovers have far exceeded expectations, a testament of how well the hardware and the software, both on Mars and on Earth, were designed and built, and of the skills of the builders.Unlike smaller programs, beauty for a large application is not necessarily found only in elegant algorithms. For CIP, beauty is in its implementation of a service-oriented architecture and in the numerous simple but well-chosen components—the nails that master software builders know just where to pound in.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 21: ERP5: Designing for Maximum Adaptability
- InhaltsvorschauRogerio Atem de Carvalho and Rafael MonneratEnterprise resource planning systems are generally known as large, proprietary, and highly expensive products. In 2001, work on an open source ERP system known as ERP5 (http://www.erp5.com) began in two French companies, Nexedi (its main developer) and Coramy (its first user). ERP5 is named after the five main concepts that compose its core. It is based on the Zope project and the Python scripting language, both of which are also open source.We have found that ERP5 is exceptionally easy to enhance, and easy for both developers and users to build on. One reason is that we adopted an innovative document-centric approach, instead of a process- or data-centric paradigm. The core idea of a document-centric paradigm is that every business process relies on a series of documents to make it happen. The document's fields correspond to the structure of the process—that is, the fields reflect the data and the relationships among this data. Thus, if you watch how the business experts who use the ERP5 system navigate through the documents, you will discover the process workflow.Zope's Content Management Framework (CMF) tools and concepts supply the technology behind this approach. Each instance of the CMF, called a portal, contains objects to which it offers services such as viewing, printing, workflow, and storage. A document structure is implemented as a Python portal class, and its behavior is implemented as a portal workflow. Therefore, users interact with web documents that are in fact views of system objects controlled by proper workflows.This chapter will show how this document-centric paradigm and a unified set of core concepts make ERP5 a highly flexible ERP. We will illustrate these ideas by explaining how we used rapid development techniques to create ERP5's project management module, Project.ERP is software that aims to integrate all the data and processes of an organization into a unique system. Since this is a real challenge, the ERP industry offers different versions of the same ERP software for different economic segments, such as oil and gas, mechanical, pharmaceutical, automobile, and government.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- General Goals of ERP
- InhaltsvorschauERP is software that aims to integrate all the data and processes of an organization into a unique system. Since this is a real challenge, the ERP industry offers different versions of the same ERP software for different economic segments, such as oil and gas, mechanical, pharmaceutical, automobile, and government.ERP software generally consists of a series of modules that automate the operations of the organization. The most common modules include finance, inventory control, payroll, production planning and control, sales, and accounting. Those modules are designed for customization and adaptation at the user's site, because even though the organizations of a given economic segment share certain common practices, every organization wants to adapt the ERP system to its specific needs. ERP software also evolves quickly to accompany the evolution of the businesses it serves, and more and more modules are added to it over time.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- ERP5
- InhaltsvorschauERP5 is developed and used by a growing business and academic community in France, Brazil, Germany, Luxembourg, Poland, Senegal, and India, among other countries. It offers an integrated business management solution based on the open source Zope plat-form (http://www.zope.org), written in the Python language (http://www.python.org). Among the key components of Zope used by ERP5 are:
- ZODB
-
An object database
- DCWorkflow
-
A workflow engine
- Content Management Framework (CMF)
-
An infrastructure for adding and moving content
- Zope Page Templates (ZPT)
-
Rapid GUI scripting based on XML
In addition, ERP5 heavily relies on XML technologies. Every object can be exported and imported in XML format, and two or more ERP5 sites can share synchronized objects through the SyncML protocol. ERP5 also implements an object-to-relational mapping scheme that stores the indexing attributes of each object in a relational database and allows much faster object search and retrieval than ZODB. In that way, objects are kept in ZODB, but searches are made using SQL, which is a standard query language.ERP5 was conceived to be a very flexible framework for developing enterprise applications. Being flexible means being adaptable to various business models without incurring high costs for changes and maintenance. To accomplish this, it is necessary to define a core object-oriented model from which new components can be easily derived for specific purposes. This model must be abstract enough to embrace all basic business concepts.As the name indicates, ERP5 therefore defines five abstract concepts that lay the basis for representing business processes:- Resource
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Underlying Zope Platform
- InhaltsvorschauTo understand why ERP5 is said to be document-driven, it is necessary first to understand how Zope and its Content Management Framework (CMF) work. Zope was originally developed as a web content management environment that provides a series of services to manage the life cycle of web documents. With time, people started to note that it can be also used to implement any kind of web-based application.In keeping with Zope's web content focus, its CMF is a framework that aims to speed the development of applications based on content types. It provides a series of services associated to these types, such as workflow, searching, security, design, and testing. From Zope, CMF inherits access to the ZODB (Zope Object Database), which provides transactions and undo functionality.CMF implements the structural part of applications through CMF Types, maintained by the
portal_typesservice, which is in turn a kind of registry tool for the recognized types of a given portal. The visible part of a portal type is a document that represents it. To implement behavior, portal types have actions associated with them, composing a workflow, which in turn is the implementation of a business process. An action on a document changes its state and is implemented by a Python script that realizes some business logic; for instance, calculating the total cost of an order. Given this framework, when developing an application in ERP5, we have to think in terms of documents that hold business process data and whose life cycles are kept by workflows, which implement the business process behavior.To take advantage of the CMF structure, ERP5 code is divided into a four-level architecture that implements a chain of concept transformations, with configuration tasks at the highest level.The first level comprises the five conceptual core classes. They have no code, only a skeleton, for the sake of simple documentation:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - ERP5 Project Concepts
- InhaltsvorschauTo exemplify how ERP5 modules are coded, we'll spend most of the rest of this chapter exploring ERP5 Project, a flexible project management tool that can be used in many ways.Due to the fast and competitive global business environment, projects are the usual form through which businesses develop innovative products and services. Therefore, project management is gaining interest in every industry segment.But what is a project? According to Wikipedia, a project is "a temporary endeavor under-taken to create a unique product or service" (http://en.wikipedia.org/wiki/Project, last visited April 13, 2007).The uniqueness of projects makes their management difficult, even for a small project sometimes. Hence the need for project management, which is "the discipline of organizing and managing resources in such a way that these resources deliver all the work required to complete a project within defined scope, quality, time, and cost constraints" (http://en.wikipedia.org/wiki/Project_management, last visited April 13, 2007).Project management therefore must control a series of data related to resources such as money, time, and people to keep everything going as planned. Because of this, information tools are needed to ease the analysis of large amounts of data.The first use for ERP5 Project was as an "internal" project management tool to support ERP5 instance creation projects. Afterwards, it was redesigned to support other types of projects in general. Even more broadly, this tool can manage order planning and execution wherever a project viewpoint can aid production planning and control. In other words, ERP5 Project should be adaptable to every situation where it is interesting to think in terms of a project composed of a series of tasks and limited by a series of constraints.ERP5 allows the developer to reuse the current packages in delivering other packages as completely new modules. Following this concept, a new business template (BT) is created by basing it on an existing one.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Coding the ERP5 Project
- InhaltsvorschauThe first thing we thought about in creating the Project BT was the main project class. Initially, instead of creating a new class, we decided to simply use Order itself, without change. But after some time, we realized that the business definitions for an order and a project are so different that we should simply create Project as a subclass of Order, without any new code in it, just for the sake of separating concerns. In this design, a project is an object that is described by a series of goals or milestones with one or more tasks associated with each of them.Then, we had to decide how to implement task management since there are differences between a project and a trade operation. The first thing to consider is that tasks can occur outside projects—for instance, in production planning. Therefore, we consider a task as a composition of task lines, or smaller activities inside a task. In that way, we decouple tasks from projects, allowing them to be used in other situations, while keeping the relation with Project through
source_projectanddestination_projectbase categories.Tasks are implemented through configuration, as we did for Task Report Line in , with the difference of using Order as a metaclass. The creation of a task associated with a project is shown here:# Add a task in task_module. Context represents the currentproject object. context_obj = context.getObject() # newContent is an ERP5 API method that creates a new content. task = context.task_module.newContent(portal_type = 'Task') # Set the source_project reference to the task. task.setSourceProjectValue(context_obj) # Redirect the user to Task GUI, so that the user can edit its properties. return context.REQUEST.RESPONSE.redirect(task.absolute_url() + '?portal_status_ message=Created+Task.')
Remember that for retrieving the tasks of a certain project, the programmer needs to use only the base categorysource_project. With this category, ERP5 RAD automatically generates signatures and algorithms of accessors. It is interesting to note that the same accessors will be created for both Task and Project. The programmer decides which ones to use through configuration, using the Actions tab shown in . In that tab, the programmer can define a new GUI for using the following methods:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauThe ERP5 team was able to implement a highly flexible tool, used for both "traditional" project management and for order planning and execution control, by making substantial reuse of already existing core concepts and code. Actually, reuse is a daily operation in ERP5 development, to the point where entire new modules are created just by changing GUI elements and adjusting workflows.Because of this emphasis on reuse, queries on the object database can be done at the abstraction levels of portal types or meta classes. In the first case, the specific business domain concept is retrieved, such as a project task. In the second case, all objects related to the UBM generic concepts are retrieved, which is quite interesting for such requirements as statistics gathering.In this chapter, we have edited some code snippets to make them more readable. All ERP5 code in its raw state is available at http://svn.erp5.org/erp5/trunk.We would like to thank Jean-Paul Smets-Solanes, ERP5 creator and chief architect, and all the guys on the team, especially Romain Courteaud and Thierry Faucher. When the authors say we during the discussion of ERP5 design and implementation, they are refer-ring to all those nice folks at Nexedi.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 22: A Spoonful of Sewage
- InhaltsvorschauBryanCantrillIf you put a spoonful of sewage in a barrel full of wine, you get sewage.Schopenhauer's Law of EntropyUnlike most things that we engineer, software has a binary notion of correctness: either it is correct, or it is flawed. That is, unlike a bridge or an airplane or a microprocessor, software doesn't have physical parameters that limit the scope of its correctness; software doesn't have a rated load or a maximum speed or an environmental envelope. In this regard, software is much more like mathematical proof than physical machine: a proof's elegance or inelegance is subjective, but its correctness is not.And indeed, this lends a purity to software that has traditionally only been enjoyed by mathematics: software, like mathematics, can be correct in an absolute and timeless sense. But if this purity of software is its Dr. Jekyll, software has a brittleness that is its Mr. Hyde: given that software can only be correct or flawed, a single flaw can become the difference between unqualified success and abject failure.Of course, this is not to say that every bug is necessarily fatal—just that the possibility always exists that a single bug will reveal something much larger than its manifestations: a design flaw that calls into question the fundamental assumptions upon which the software was built. Such a flaw can shake software to its core and, in the worst case, invalidate it completely. That is, a single software bug can be the proverbial spoonful of sewage in the barrel of wine, turning what would otherwise be an enjoyable pleasure into a toxic stew.For me personally, this fine line between wine and sewage was never so stark as in one incident in the development of a critical subsystem of the Solaris kernel in 1999. This problem—and its solution—are worth discussing at some length, for they reveal how profound design defects can manifest themselves as bugs, and how devilish the details can become when getting a complicated and important body of software to function perfectly.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 23: Distributed Programming with MapReduce
- InhaltsvorschauJeffrey Dean and Sanjay GhemawatThis chapter describes the design and implementation of mapreduce, a programming system for large-scale data processing problems. MapReduce was developed as a way of simplifying the development of large-scale computations at Google. MapReduce programs are automatically parallelized and executed on a large cluster of commodity machines. The runtime system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required intermachine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system.Suppose that you have 20 billion documents, and you want to generate a count of how often each unique word occurs in the documents. With an average document size of 20 KB, just reading through the 400 terabytes of data on one machine will take roughly four months. Assuming we were willing to wait that long and that we had a machine with sufficient memory, the code would be relatively simple. (all the examples in this chapter are pseudocode) shows a possible algorithm.Example 23-1. Naïve, nonparallel word count program
map<string, int> word_count; for each document d { for each word w in d { word_count[w]++; } } ... save word_count to persistent storage ...One way of speeding up this computation is to perform the same computation in parallel across each individual document, as shown in .Example 23-2. Parallelized word count programMutex lock; // Protects word_count map<string, int> word_count; for each document d in parallel { for each word w in d { lock.Lock(); word_count[w]++; lock.Unlock(); } } ... save word_count to persistent storage ...The preceding code nicely parallelizes the input side of the problem. In reality, the code to start up threads would be a bit more complex, since we've hidden a bunch of details by using pseudocode. One problem with is that it uses a single global data structure for keeping track of the generated counts. As a result, there is likely to be significant lock contention with theEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - A Motivating Example
- InhaltsvorschauSuppose that you have 20 billion documents, and you want to generate a count of how often each unique word occurs in the documents. With an average document size of 20 KB, just reading through the 400 terabytes of data on one machine will take roughly four months. Assuming we were willing to wait that long and that we had a machine with sufficient memory, the code would be relatively simple. (all the examples in this chapter are pseudocode) shows a possible algorithm.Example 23-1. Naïve, nonparallel word count program
map<string, int> word_count; for each document d { for each word w in d { word_count[w]++; } } ... save word_count to persistent storage ...One way of speeding up this computation is to perform the same computation in parallel across each individual document, as shown in .Example 23-2. Parallelized word count programMutex lock; // Protects word_count map<string, int> word_count; for each document d in parallel { for each word w in d { lock.Lock(); word_count[w]++; lock.Unlock(); } } ... save word_count to persistent storage ...The preceding code nicely parallelizes the input side of the problem. In reality, the code to start up threads would be a bit more complex, since we've hidden a bunch of details by using pseudocode. One problem with is that it uses a single global data structure for keeping track of the generated counts. As a result, there is likely to be significant lock contention with theword_countdata structure as the bottleneck. This problem can be fixed by partitioning theword_countdata structure into a number of buckets with a separate lock per bucket, as shown in .Example 23-3. Parallelized word count program with partitioned storagestruct CountTable { Mutex lock; map<string, int> word_count; }; const int kNumBuckets = 256; CountTable tables[kNumBuckets]; for each document d in parallel { for each word w in d { int bucket = hash(w) % kNumBuckets; tables[bucket].lock.Lock(); tables[bucket].word_count[w]++; tables[bucket].lock.Unlock(); } } for (int b = 0; b < kNumBuckets; b++) { ... save tables[b].word_count to persistent storage ... }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The MapReduce Programming Model
- InhaltsvorschauIf you compare with , you'll find that the simple task of counting words has been buried under lots of details about managing parallelism. If we can somehow separate the details of the original problem from the details of parallelization, we may be able to produce a general parallelization library or system that can be applied not just to this word-counting problem, but other large-scale processing problems. The parallelization pattern that we are using is:
-
For each input record, extract a set of key/value pairs that we care about from each record.
-
For each extracted key/value pair, combine it with other values that share the same key (perhaps filtering, aggregating, or transforming values in the process).
Let's rewrite our program to implement the application-specific logic of counting word frequencies for each document and summing these counts across documents in two functions that we'll call Map and Reduce. The result is .Example 23-5. Division of word counting problem into Map and Reducevoid Map(string document) { for each word w in document { EmitIntermediate(w, "1"); } } void Reduce(string word, list<string> values) { int count = 0; for each v in values { count += StringToInt(v); } Emit(word, IntToString(count)); }A simple driver program that uses these routines to accomplish the desired task on a single machine would look like .Example 23-6. Driver for Map and Reducemap<string, list<string> > intermediate_data; void EmitIntermediate(string key, string value) { intermediate_data[key].append(value); } void Emit(string key, string value) { ... write key/value to final data file ... } void Driver(MapFunction mapper, ReduceFunction reducer) { for each input item do { mapper(item) } for each key k in intermediate_data { reducer(k, intermediate_data[k]); } } main() { Driver(Map, Reduce); }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Other MapReduce Examples
- InhaltsvorschauWe'll examine the implementation of a much more sophisticated driver program that automatically runs MapReduce programs on large-scale clusters of machines in a moment, but first, let's consider a few other problems and how they can be solved using Map-Reduce:
- Distributed grep
-
The Map function emits a line if it matches a supplied regular expression pattern. The Reduce function is an identity function that just copies the supplied intermediate data to the output.
- Reverse web-link graph
-
A forward web-link graph is a graph that has an edge from node URL1 to node URL2 if the web page found at URL1 has a hyperlink to URL2. A reverse web-link graph is the same graph with the edges reversed. MapReduce can easily be used to construct a reverse web-link graph. The Map function outputs <target, source> pairs for each link to a target URL found in a document named source. The Reduce function concatenates the list of all source URLs associated with a given target URL and emits the pair <target, list of source URLs>.
- Term vector per host
-
A term vector summarizes the most important words that occur in a document or a set of documents as a list of <word, frequency> pairs. The Map function emits a <hostname, term vector> pair for each input document (where the hostname is extracted from the URL of the document). The Reduce function is passed all per-document term vectors for a given host. It adds these term vectors, throwing away infrequent terms, and then emits a final <hostname, term vector> pair.
- Inverted index
-
An inverted index is a data structure that maps from each unique word to a list of documents that contain the word (where the documents are typically identified with a numeric identifier to keep the inverted index data relatively compact). The Map function parses each document and emits a sequence of
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - A Distributed MapReduce Implementation
- InhaltsvorschauMuch of the benefit of the MapReduce programming model is that it nicely separates the expression of the desired computation from the underlying details of parallelization, failure handling, etc. Indeed, different implementations of the MapReduce programming model are possible for different kinds of computing platforms. The right choice depends on the environment. For example, one implementation may be suitable for a small shared-memory machine, another for a large NUMA multiprocessor, and yet another for an even larger collection of networked machines.A very simple single-machine implementation that supports the programming model was shown in the code fragment in . This section describes a more complex implementation that is targeted to running large-scale MapReduce jobs on the computing environment in wide use at Google: large clusters of commodity PCs connected together with switched Ethernet (see "Further Reading," at the end of this chapter). In this environment:
-
Machines are typically dual-processor x86 processors running Linux, with 2–4 GB of memory per machine.
-
Machines are connected using commodity-networking hardware (typically 1 gigabit/ second switched Ethernet). Machines are organized into racks of 40 or 80 machines. These racks are connected to a central switch for the whole cluster. The bandwidth available when talking to other machines in the same rack is 1 gigabit/second per machine, while the per-machine bandwidth available at the central switch is much smaller (usually 50 to 100 megabits/second per machine).
-
Storage is provided by inexpensive IDE disks attached directly to individual machines. A distributed filesystem called GFS (see the reference to "The Google File System" under "Further Reading," at the end of this chapter) is used to manage the data stored on these disks. GFS uses replication to provide availability and reliability on top of unreliable hardware by breaking files into chunks of 64 megabytes and storing (typically) 3 copies of each chunk on different machines.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Extensions to the Model
- InhaltsvorschauAlthough most uses of MapReduce require just writing Map and Reduce functions, we have extended the basic model with a few features that we have found useful in practice.
- Partitioning function
-
MapReduce users specify the number of reduce tasks/output files that they desire (R). Intermediate data gets partitioned across these tasks using a partitioning function on the intermediate key. A default partitioning function is provided that uses hashing (
hash(key)%R) to evenly balance the data across the R partitions.In some cases, however, it is useful to partition data by some other function of the key. For example, sometimes the output keys are URLs, and we want all entries for a single host to end up in the same output file. To support situations like this, the users of the MapReduce library can provide their own custom partitioning function. For example, usinghash(Hostname(urlkey))%R as the partitioning function causes all URLs from the same host to end up in the same output file. - Ordering guarantees
-
Our MapReduce implementation sorts the intermediate data to group together all intermediate values that share the same intermediate key. Since many users find it convenient to have their Reduce function called on keys in sorted order, and we have already done all of the necessary work, we expose this to users by guaranteeing this ordering property in the interface to the MapReduce library.
- Skipping bad records
-
Sometimes there are bugs in user code that cause the Map or Reduce functions to crash deterministically on certain records. Such bugs may cause a large MapReduce execution to fail after doing large amounts of computation. The preferred course of action is to fix the bug, but sometimes this is not feasible; for instance, the bug may be in a third-party library for which source code is not available. Also, it is sometimes acceptable to ignore a few records, such as when doing statistical analysis on a large data set. Thus, we provide an optional mode of execution where the MapReduce library detects records that cause deterministic crashes and skips these records in subsequent re-executions, in order to make forward progress.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauMapReduce has proven to be a valuable tool at Google. As of early 2007, we have more than 6,000 distinct programs written using the MapReduce programming model, and run more than 35,000 MapReduce jobs per day, processing about 8 petabytes of input data per day (a sustained rate of about 100 gigabytes per second). Although we originally developed the MapReduce programming model as part of our efforts to rewrite the indexing system for our web search product, it has shown itself to be useful across a very broad range of problems, including machine learning, statistical machine translation, log analysis, information retrieval experimentation, and general large-scale data processing and computation tasks.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Further Reading
- Inhaltsvorschau
- A more detailed description of MapReduce appeared in the OSDI ‘04 conference:
-
"MapReduce: Simplified Data Processing on Large Clusters." Jeffrey Dean and Sanjay Ghemawat. Appeared in OSDI '04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, December, 2004. Available from http://labs.google.com/papers/mapreduce.html.
- A paper about the design and implementation of the Google File System appeared in the SOSP ‘03 conference:
-
"The Google File System." Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. 19th ACM Symposium on Operating Systems Principles, Lake George, NY, October, 2003. Available from http://labs.google.com/papers/gfs.html.
- A paper describing the general hardware infrastructure at Google appeared in IEEE Micro:
-
"Web Search for a Planet: The Google Cluster Architecture." Luiz Barroso, Jeffrey Dean, and Urs Hoelzle. IEEE Micro, Volume 23, Issue 2 (March 2003), pp. 22–28. Available from http://labs.google.com/papers/googlecluster.html.
- A language called Sawzall developed at Google for logs analysis runs on top of Map-Reduce:
-
"Interpreting the Data: Parallel Analysis with Sawzall." Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan. Scientific Programming Journal Special Issue on Grids and Worldwide Computing Programming Models and Infrastructure 13:4, pp. 227– 298. Available from http://labs.google.com/papers/sawzall.html.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Acknowledgments
- InhaltsvorschauA number of people have made substantial contributions to the continued development and improvement of MapReduce, including Tom Annau, Matt Austern, Chris Colohan, Frank Dabek, Walt Drummond, Xianping Ge, Victoria Gilbert, Shan Lei, Josh Levenberg, Nahush Mahajan, Greg Malewicz, Russell Power, Will Robinson, Ioannis Tsoukalidis, and Jerry Zhao. MapReduce builds on a number of pieces of infrastructure developed at Google, including the Google File System and our cluster scheduling system. We would like to especially thank the developers of those systems. Finally, we thank all the users of Map-Reduce within Google's engineering organization for providing helpful feedback, suggestions, and bug reports.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Appendix: Word Count Solution
- InhaltsvorschauThis section contains the full C++ implementation of the word frequency counting example that was used in the early part of this chapter. The code can also be found on the O'Reilly web site for this book (http://www.oreilly.com/catalog/9780596510046):
#include "mapreduce/mapreduce.h" // User's map function class WordCounter : public Mapper { public: virtual void Map(const MapInput& input) { const string& text = input.value(); const int n = text.size(); for (int i = 0; i < n; ) { // Skip past leading whitespace while ((i < n) && isspace(text[i])) i++; // Find word end int start = i; while ((i < n) && !isspace(text[i])) i++; if (start < i) EmitIntermediate(text.substr(start,i-start),"1"); } } }; REGISTER_MAPPER(WordCounter); // User's reduce function class Adder : public Reducer { virtual void Reduce(ReduceInput* input) { // Iterate over all entries with the // same key and add the values int64 value = 0; while (!input->done()) { value += StringToInt(input->value()); input->NextValue(); } // Emit sum for input->key() Emit(IntToString(value)); } }; REGISTER_REDUCER(Adder); int main(int argc, char** argv) { ParseCommandLineFlags(argc, argv);MapReduceSpecification spec; // Store list of input files into "spec" for (int i = 1; i < argc; i++) { MapReduceInput* input = spec.add_input(); input->set_format("text"); input->set_filepattern(argv[i]); input->set_mapper_class("WordCounter"); } // Specify the output files: // /gfs/test/freq-00000-of-00100 // /gfs/test/freq-00001-of-00100 // ... MapReduceOutput* out = spec.output(); out->set_filebase("/gfs/test/freq"); out->set_num_tasks(100); out->set_format("text"); out->set_reducer_class("Adder"); // Optional: do partial sums within map // tasks to save network bandwidth out->set_combiner_class("Adder"); // Tuning parameters: use at most 2,000 // machines and 100 MB of memory per task spec.set_machines(2000); spec.set_map_megabytes(100); spec.set_reduce_megabytes(100); // Now run it MapReduceResult result; if (!MapReduce(spec, &result)) abort( ); // Done: 'result' structure contains info // about counters, time taken, number of // machines used, etc. return 0; }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Chapter 24: Beautiful Concurrency
- InhaltsvorschauSimon Peyton JonesThe free lunch is over. We have grown used to the idea that our programs will go faster when we buy a next-generation processor, but that time has passed. While that next-generation chip will have more CPUs, each individual CPU will be no faster than the previous year's model. If we want our programs to run faster, we must learn to write parallel programs.Parallel programs execute in a nondeterministic way, so they are hard to test, and bugs can be almost impossible to reproduce. For me, a beautiful program is one that is so simple and elegant that it obviously has no mistakes, rather than merely having no obvious mistakes. If we want to write parallel programs that work reliably, we must pay particular attention to beauty. Sadly, parallel programs are often less beautiful than their sequential cousins; in particular they are, as we shall see, less modular.In this chapter, I'll describe Software Transactional Memory (STM), a promising new approach to programming shared-memory parallel processors that seems to support modular programs in a way that current technology does not. By the time we are done, I hope you will be as enthusiastic as I am about STM. It is not a solution to every problem, but it is a beautiful and inspiring attack on the daunting ramparts of concurrency.Here is a simple programming task.Write a procedure to transfer money from one bank account to another. To keep things simple, both accounts are held in memory: no interaction with databases is required. The procedure must operate correctly in a concurrent program, in which many threads may call
transfersimultaneously. No thread should be able to observe a state in which the money has left one account, but not arrived in the other (or vice versa).This example is somewhat unrealistic, but its simplicity allows us to focus in this chapter on what is new about the solution: the language Haskell and transactional memory. But first let us briefly look at the conventional approach.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - A Simple Example: Bank Accounts
- InhaltsvorschauHere is a simple programming task.Write a procedure to transfer money from one bank account to another. To keep things simple, both accounts are held in memory: no interaction with databases is required. The procedure must operate correctly in a concurrent program, in which many threads may call
transfersimultaneously. No thread should be able to observe a state in which the money has left one account, but not arrived in the other (or vice versa).This example is somewhat unrealistic, but its simplicity allows us to focus in this chapter on what is new about the solution: the language Haskell and transactional memory. But first let us briefly look at the conventional approach.The dominant technologies used for coordinating concurrent programs today are locks and condition variables. In an object-oriented language, every object has an implicit lock, and the locking is done by synchronized methods, but the idea is the same. So, one might define a class for bank accounts something like this:class Account { Int balance; synchronized void withdraw( Int n ) { balance = balance - n; } void deposit( Int n ) { withdraw( -n ); } }We must be careful to use asynchronizedmethod forwithdraw, so that we do not get any missed decrements if two threads callwithdrawat the same time. The effect ofsynchronizedis to take a lock on the account, runwithdraw, and then release the lock.Now, here is how we might write the code fortransfer:void transfer( Account from, Account to, Int amount ) { from.withdraw( amount ); to.deposit( amount ); }This code is fine for a sequential program, but in a concurrent program, another thread could observe an intermediate state in which the money has left accountfrombut has not arrived into. The fact that both methods aresynchronizedEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Software Transactional Memory
- InhaltsvorschauSoftware Transactional Memory is a promising new approach to the challenge of concurrency, as I will explain in this section. I shall explain STM using Haskell, the most beautiful programming language I know, because STM fits into Haskell particularly elegantly. If you don't know any Haskell, don't worry; we'll learn it as we go.Here is the beginning of the code for
transferin Haskell:transfer :: Account -> Account -> Int -> IO ( ) -- Transfer 'amount' from account 'from' to account 'to' transfer from to amount = ...
The second line of this definition, starting with --, is a comment. The first line gives the type signature fortransfer. This signature says thattransfertakes as its arguments two values of typeAccount(the source and destination accounts) and anInt(the amount to transfer), and returns a value of typeIO ( ). This result type says, "transferreturns an action that, when performed, may have some side effects, and then returns a value of type( )." The type( ), pronounced "unit," has just one value, which is also written( ); it is akin tovoidin C. So,transfer'sresult typeIO ( )announces that its side effects constitute the only reason for calling it. Before we go further, we must explain how side effects are handled in Haskell.A side effect is anything that reads or writes mutable state. Input/output is a prominent example of a side effect. For example, here are the signatures of two Haskell functions with input/output effects:hPutStr :: Handle -> String -> IO () hGetLine :: Handle -> IO String
We call any value of typeIO tan action. So, (hPutStrh "hello") is an action that, when performed, will printhelloon handlehand return the unit value. Similarly, (hGetLine h) is an action that, when performed, will read a line of input from handlehand return it as aString. We can glue together little side-effecting programs to make bigger side-effecting programs using Haskell'sEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Santa Claus Problem
- InhaltsvorschauI want to show you a complete, runnable concurrent program using STM. A well-known example is the so-called Santa Claus problem, originally attributed to Trono:Santa repeatedly sleeps until wakened by either all of his nine reindeer, back from their holidays, or by a group of three of his ten elves. If awakened by the reindeer, he harnesses each of them to his sleigh, delivers toys with them and finally unharnesses them (allowing them to go off on holiday). If awakened by a group of elves, he shows each of the group into his study, consults with them on toy R&D and finally shows them each out (allowing them to go back to work). Santa should give priority to the reindeer in the case that there is both a group of elves and a group of reindeer waiting.Using a well-known example allows you to directly compare my solution with well-described solutions in other languages. In particular, Trono's paper gives a semaphore-based solution that is partially correct. Ben-Ari gives a solution in Ada95 and in Ada. Benton gives a solution in Polyphonic C#.The basic idea of the STM Haskell implementation is this. Santa makes one "
Group" for the elves and one for the reindeer. Each elf (or reindeer) tries to join its Group. If it succeeds, it gets two "Gates" in return. The firstGateallows Santa to control when the elf can enter the study and also lets Santa know when they are all inside. Similarly, the secondGatecontrols the elves leaving the study. Santa, for his part, waits for either of his twoGroupsto be ready, and then uses thatGroup's Gatesto marshal his helpers (elves or reindeer) through their task. Thus the helpers spend their lives in an infinite loop: try to join a group, move through the gates under Santa's control, and then delay for a random interval before trying to join a group again.Rendering this informal description in Haskell gives the following code for an elf:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Reflections on Haskell
- InhaltsvorschauHaskell is, first and foremost, a functional language. Nevertheless, I think that it is also the world's most beautiful imperative language. Considered as an imperative language, Haskell's unusual features are that:
-
Actions (which have effects) are rigorously distinguished from pure values by the type system.
-
Actions are first-class values. They can be passed to functions, returned as results, formed into lists, and so on, all without causing any side effects.
Using actions as first-class values, the programmer can define application-specific control structures, rather than make do with the ones provided by the language designer. For example,nTimesis a simpleforloop, andchooseimplements a sort of guarded command. We also saw other applications of actions as values. In the main program, we used Haskell's rich expression language (in this case, list comprehensions) to generate a list of actions, which we then performed in order, usingsequence_. Earlier, when defininghelper1, we improved modularity by abstracting out an action from a chunk of code. To illustrate these points, I have perhaps overused Haskell's abstraction power in the Santa code, which is a very small program. For large programs, though, it is hard to overstate the importance of actions as values.On the other hand, I have underplayed other aspects of Haskell—higher-order functions, lazy evaluation, data types, polymorphism, type classes, and so on—because of the focus on concurrency. Not many Haskell programs are as imperative as this one! You can find a great deal of information about Haskell at http://haskell.org, including books, tutorials, Haskell compilers and interpreters, Haskell libraries, mailing lists, and much more besides.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Conclusion
- InhaltsvorschauMy main goal is to persuade you that you can write programs in a fundamentally more modular way using STM than you can with locks and condition variables. First, though, note that transactional memory allows us to completely avoid many of the standard problems that plague lock-based concurrent programs (as explained earlier in the section "Locks Are Bad"). None of these problems arises in STM Haskell. The type system prevents you from reading or writing a
TVaroutside an atomic block, and because there are no programmer-visible locks, the questions of which locks to take, and in which order, simply do not arise. Other benefits of STM, which I lack the space to describe here, include freedom from lost wakeups and the treatment of exceptions and error recovery.However, as we also discussed in the section "Locks Are Bad," the worst problem with lock-based programming is that locks do not compose. In contrast, any function with anSTMtype in Haskell can be composed, using sequencing or choice, with any other function with anSTMtype to make a new function ofSTMtype. Furthermore, the compound function will guarantee all the same atomicity properties that the individual functions did. In particular, blocking (retry) and choice (orElse), which are fundamentally non-modular when expressed using locks, are fully modular in STM. For example, consider this transaction, which uses functions we defined in the section "Blocking and Choice":atomically (do { limitedWithdraw a1 10 ; limitedWithdraw2 a2 a3 20 })This transaction blocks untila1contains at least 10 units, and eithera2ora3has 20 units. However, that complicated blocking condition is not written explicitly by the programmer, and indeed if thelimitedWithdrawfunctions are implemented in a sophisticated library, the programmer might have no idea what their blocking conditions are. STM is modular: small programs can be glued together to make larger programsEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Acknowledgments
- InhaltsvorschauI would like to thank those who helped me to improve the chapter with their feedback: Bo Adler, Justin Bailey, Matthew Brecknell, Paul Brown, Conal Elliot, Tony Finch, Kathleen Fisher, Greg Fitzgerald, Benjamin Franksen, Jeremy Gibbons, Tim Harris, Robert Helgesson, Dean Herington, David House, Brian Hulley, Dale Jordan, Marnix Klooster, Chris Kuklewicz, Evan Martin, Greg Meredith, Neil Mitchell, Jun Mukai, Michal Palka, Sebastian Sylvan, Johan Tibell, Aruthur van Leeuwen, Wim Vanderbauwhede, David Wakeling, Dan Wang, Peter Wasilko, Eric Willigers, Gaal Yahas, and Brian Zimmer. My special thanks go to Kirsten Chevalier, Andy Oram, and Greg Wilson for their particularly detailed reviews.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 25: Syntactic Abstraction: The syntax-case Expander
- InhaltsvorschauR. Kent DybvigWhen writing computer programs, certain patterns arise over and over again. For example, programs must often loop through the elements of arrays, increment or decrement the values of variables, and perform multiway conditionals based on numeric or character values. Programming language designers typically acknowledge this by including special-purpose syntactic constructs that handle the most common patterns. C, for instance, provides multiple looping constructs, multiple conditional constructs, and multiple constructs for incrementing or otherwise updating the value of a variable.Some patterns are less common but may occur frequently in a certain class of programs, or perhaps just within a single program. These patterns may not even be anticipated by a language's designers, who in any case would typically choose not to incorporate syntactic constructs to handle such patterns in the language core.Yet, recognizing that such patterns do arise and that special-purpose syntactic constructs can make programs both simpler and easier to read, language designers sometimes include a mechanism for syntactic abstraction, such as C's preprocessor macros or Common Lisp macros. When such facilities are absent or are inadequate for a specific purpose, an external tool, such as the m4 macro expander, may be brought to bear.Syntactic abstraction facilities differ in several significant ways. C's preprocessor macros are essentially token-based, allowing the replacement of a macro call with a sequence of tokens; text passed to the macro call is substituted for the macro's formal parameters, if any. Lisp macros are expression-based, allowing the replacement of a single expression with another expression, computed in Lisp itself and based on the subforms of the macro call, if any.In both cases, identifiers appearing within a macro-call subform are scoped where they appear in the output, rather than where they appear in the input, possibly leading to unintendedEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Brief Introduction to syntax-case
- InhaltsvorschauWe'll start with a few brief
syntax-caseexamples, adapted from the Chez Scheme Version 7 User's Guide (R. Kent Dybvig, Cadence Research Systems, 2005). Additional examples and a more detailed description ofsyntax-caseare given in that document and in The Scheme Programming Language, Third Edition.The following definition of or illustrates the form of asyntax-casemacro definition:(define-syntax or (lambda (x) (syntax-case x () [(_ e1 e2) (syntax (let ([t e1]) (if t t e2)))])))
Thedefine-syntaxform creates a keyword binding, associating the keywordorin this case with a transformation procedure, or transformer. The transformer is obtained by evaluating, at expansion time, thelambdaexpression on the righthand side of thedefine-syntaxform. Thesyntax-caseform is used to parse the input, and thesyntaxform is used to construct the output, via straightforward pattern matching. The pattern (_e1 e2) specifies the shape of the input, with the underscore (_) marking where the keyword or appears, and the pattern variablese1ande2bound to the first and second subforms. The template(let([te1]) (iftte2))specifies the output, withe1ande2inserted from the input.The form (syntaxtemplate) may be abbreviated to#'template, so the previous definition may be rewritten as follows:(define-syntax or (lambda (x) (syntax-case x () [(_ e1 e2) (syntax (let ([t e1]) (if t t e2))])))
Macros may also be bound within a single expression vialetrec-syntax.(letrec-syntax ([or (lambda (x) (syntax-case x () [(_ e1 e2) #'(let ([t e1]) (if t t e2))]))]) (or a b))
Macros can be recursive (i.e., expand into occurrences of themselves), as illustrated by the following version of or that handles an arbitrary number of subforms. Multiplesyntax-caseEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Expansion Algorithm
- InhaltsvorschauThe
syntax-caseexpansion algorithm is essentially a lazy variant of the KFFD algorithm that operates on an abstract representation of the input expression rather than on the traditional s-expression representation. The abstract representation encapsulates both a representation of an input form and a wrap that enables the algorithm to determine the scope of all identifiers within the form. The wrap consists of marks and substitutions.Marks are like KFFD timestamps and are added to the portions of a macro's output that are introduced by the macro.Substitutions map identifiers to bindings with the help of a compile-time environment. Substitutions are created whenever a binding form, such aslambda, is encountered, and they are added to the wraps of the syntax objects representing the forms within the scope of the binding form's bindings. A substitution applies to an identifier only if the identifier has the same name and marks as the substituted identifier.Expansion operates in a recursive, top-down fashion. As the expander encounters a macro call, it invokes the associated transformer on the form, marking it first with a fresh mark and then marking it again with the same mark. Like marks cancel, so only the introduced portions of the macro's output—i.e., those portions not simply copied from the input to the output—remain marked.When a core form is encountered, a core form in the output language of the expander (in our case, the traditional s-expression representation) is produced, with any subforms recursively expanded as necessary. Variable references are replaced by generated names via the substitution mechanism.The most important aspect of thesyntax-casemechanism is its abstract representation of program source code as syntax objects. As described above, a syntax object encapsulates not only a representation of the program source code but also a wrap that provides sufficient information about the identifiers contained within the code to implement hygiene:Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Example
- InhaltsvorschauWe now trace through the following example from the beginning of this chapter:
(let ([t #t]) (or #f t))
We assume thatorhas been defined to do the transformation given at the beginning of the chapter, using the equivalent of the following definition oforfrom the section "Brief Introduction tosyntax-case":(define-syntax or (lambda (x) (syntax-case x () [(_ e1 e2) #'(let ([t e1]) (if t t e2))])))
At the outset, the expander is presented with a syntax object whose expression is(let ([t #t]) (or #f t)), and the wrap is empty, except for the contents of the initial wrap, which we suppress for brevity. (We identify syntax objects by enclosing the expression and wrap entries, if any, in angle brackets.)<(let ((t #t)) (or #f t))>
The expander is also presented with the initial environment, which we assume contains a binding for the macrooras well as for the core forms and built-in procedures. Again, these environment entries are omitted for brevity, along with the meta environment, which plays no role here since we are not expanding any transformer expressions.Theletexpression is recognized as a core form becauseletis present in the initial wrap and environment. The transformer forletrecursively expands the righthand-side expression#tin the input environment, yielding#t. It also recursively expands the body with an extended wrap that mapsxto a fresh labell1:<(or #f t) [t x () → l1]>
Substitutions are shown with enclosing brackets, the name and list of marks separated by the symbol x, and the label following a right arrow.The environment is also extended to map the label to a binding of typelexicalwith the fresh namet.1:l1 → lexical(t.1)
Theorform is recognized as a macro call, so the transformer fororis invoked, producing a new expression to be evaluated in the same environment. The input of theEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauThe simplified expander described here illustrates the basic algorithm that underlies a complete implementation of
syntax-case, without the complexities of the pattern-matching mechanism, handling of internal definitions, and additional core forms that are usually handled by an expander. The representation of environments is tailored to the single-bindinglambda, let, andletrec-syntaxforms implemented by the expander; a more efficient representation that handles groups of bindings would typically be used in practice. While these additional features are not trivial to add, they are conceptually independent of the expansion algorithm.Thesyntax-caseexpander extends the KFFD hygienic macro-expansion algorithm with support for local syntax bindings and controlled capture, among other things, and also eliminates the quadratic expansion overhead of the KFFD algorithm.The KFFD algorithm is simple and elegant, and an expander based on it could certainly be a beautiful piece of code. Thesyntax-caseexpander, on the other hand, is of necessity considerably more complex. It is not, however, any less beautiful, for there can still be beauty in complex software as long as it is well structured and does what it is designed to do.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Chapter 26: Labor-Saving Architecture: An Object-Oriented Framework for Networked Software
- InhaltsvorschauWilliam R. Otte and Douglas C. SchmidtDeveloping software for networked applications is hard, and developing reusable software for networked applications is even harder. First, there are the complexities inherent to distributed systems, such as optimally mapping application services onto hardware nodes, synchronizing service initialization, and ensuring availability while masking partial failures. These complexities can stymie even experienced software developers because they arise from fundamental challenges in the domain of network programming.Unfortunately, developers must also master accidental complexities, such as low-level and nonportable programming interfaces and the use of function-oriented design techniques that require tedious and error-prone revisions as requirements and/or platforms evolve. These complexities arise largely from limitations with the software tools and techniques applied historically by developers of networked software.Despite the use of object-oriented technologies in many domains, such as graphical user interfaces and productivity tools, much networked software still uses C-level operating system (OS) application programmatic interfaces (APIs), such as the Unix socket API or the Windows threading API. Many accidental complexities of networked programming stem from the use of these C-level OS APIs, which are not type-safe, often not reentrant, and not portable across OS platforms. The C APIs were also designed before the wide-spread adoption of modern design methods and technologies, so they encourage developers to decompose their problems functionally in terms of processing steps in a top-down design, instead of using OO design and programming techniques. Experience over the past several decades has shown that functional decomposition of nontrivial software complicates maintenance and evolution because functional requirements are rarely stable design centers.Fortunately, two decades of advances in design/implementation techniques and programming languages have made it much easier to write and reuse networked software. In particular, object-oriented (OO) programming languages (such as C++, Java, and C#) combined withEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Sample Application: Logging Service
- InhaltsvorschauThe OO software that we use as the basis of our case study is a networked logging service. As shown in , this service consists of client applications that generate log records and send them to a central logging server that receives and stores the log records for later inspection and processing.
Figure 26-1: Architecture of a networked logging serviceThe logging server portion (at the center of ) of our networked logging service provides an ideal context for demonstrating the beauty of OO networked software because it exhibits the following dimensions of design-time variability that developers can choose from when implementing such a server:-
Different interprocess communication (IPC) mechanisms (such as sockets, SSL, shared memory, TLI, named pipes, etc.) that developers can use to send and receive log records.
-
Different concurrency models (such as iterative, reactive, thread-per-connection, process-per-connection, various types of thread pools, etc.) that developers can use to process log records.
-
Different locking strategies (such as thread-level or process-level recursive mutex, nonrecursive mutex, readers/writer lock, null mutex, etc.) that developers can use to serialize access to resources, such as a count of the number of requests, shared by multiple threads.
-
Different log record formats that can be transmitted from client to server. Once received by the server, the log records can be handled in different ways—e.g., printed to console, stored to a single file, or even one file per client to maximize parallel writes to disk.
It is relatively straightforward to implement any one of these combinations, such as running one thread per connection-logging server using socket-based IPC and a thread-level nonrecursive mutex. A one-size-fits-all solution, however, is inadequate to meet the needs of all logging services because different customer requirements and different operating environments can have significantly different impacts on time/space trade-offs, cost, and schedule. A key challenge is therefore to design a configurable logging server that isEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Object-Oriented Design of the Logging Server Framework
- InhaltsvorschauBefore we discuss the OO design of our logging server, it is important to understand several key concepts about OO frameworks. Most programmers are familiar with the concept of a class library, which is a set of reusable classes that provides functionality that may be used when developing OO programs. OO frameworks extend the benefits of OO class libraries in the following ways:
- They define "semi-complete" applications that embody domain-specific object structures and functionality
-
Classes in a framework work together to provide a generic architectural skeleton for applications in a particular domain, such as graphical user interfaces, avionics mission computing, or networked logging services. Complete applications can be composed by inheriting from and/or instantiating framework components. In contrast, class libraries are less domain-specific and provide a smaller scope of reuse. For instance, class library components such as classes for strings, complex numbers, arrays, and bitsets are relatively low-level and ubiquitous across many application domains.
- Frameworks are active and exhibit "inversion of control" at runtime
-
Class libraries are typically passive—i.e., they perform isolated bits of processing when invoked by threads of control from self-directed application objects. In contrast, frameworks are active—i.e., they direct the flow of control within an application via event-dispatching patterns, such as Reactor and Observer. The "inversion of control" in the runtime architecture of a framework is often referred to as "The Hollywood Principle," which states "Don't call us, we'll call you."
Frameworks are typically designed by analyzing various potential problems that the framework might address and identifying which parts of each solution are the same and which areas of each solution are unique. This design method is calledEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Implementing Sequential Logging Servers
- InhaltsvorschauThis section demonstrates the implementation of logging servers that feature sequential concurrency models—i.e., all processing is performed in a single thread. We cover both iterative and reactive implementations of sequential logging servers.Iterative servers process all log records from each client before handling any log records from the next client. Since there is no need to spawn or synchronize threads, we use the
Null_Mutexfacade to parameterize theIterative_Logging_Serversubclass template, as follows:template <typename ACCEPTOR> class Iterative_Logging_Server : virtual Logging_Server<ACCEPTOR, Null_Mutex> { public: typedef Logging_Server<ACCEPTOR, Null_Mutex>::HANDLER HANDLER; Iterative_Logging_Server (int argc, char *argv[]); protected: virtual void open (void); virtual void wait_for_multiple_events (void) {}; virtual void handle_connections (void); virtual void handle_data (typename ACCEPTOR::PEER_STREAM *stream = 0); HANDLER log_handler_; // One log file shared by all clients. std::ofstream logfile_; };Implementing this version of our server is straightforward. Theopen( )method decorates the behavior of the method from theLogging_Serverbase class by opening an output file before delegating to the parent'sopen( ), as follows:template <typename ACCEPTOR> void Interative_Logging_Server<ACCEPTOR>::open (void) { logfile_.open (filename_.c_str ()); if (!logfile_.good ()) throw std::runtime_error; // Delegate to the parent's open() method. Logging_Server<ACCEPTOR, Null_Mutex>::open (); }Thewait_for_multiple_events( )method is a no-op. It is not needed because we just handle a single connection at any one time. Thehandle_connections( )method therefore simply blocks until a new connection is established, as follows:template <typename ACCEPTOR> void Iterative_Logging_Server<ACCEPTOR>::handle_connections (void) { acceptor_.accept (log_handler_.peer ()); }Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Implementing Concurrent Logging Servers
- InhaltsvorschauTo overcome the scalability limitations of the iterative and reactive servers shown in the previous sections, the logging servers in this section use OS concurrency mechanisms: processes and threads. Using the APIs provided by operating systems to spawn threads or processes, however, can be a daunting task due to accidental complexities in their design. These complexities stem from semantic and syntactic differences that exist not only between different operating systems, but also different versions of the same operating system. Our solution to these complexities is again to apply wrapper facades that provide a consistent interface across platforms and integrate these wrapper facades into our OO
Logging_Serverframework.Our thread-per-connection logging server (TPC_Logging_Server) runs a main thread that waits for and accepts new connections from clients. After accepting a new connection, a new worker thread is spawned to handle incoming log records from that connection. shows the steps in this process.
Figure 26-10: Steps in the thread-per-connection logging serverThe main loop for this particular logging server differs from the steps depicted in because the call tohandle_data( )is not necessary, as the worker threads are responsible for that call. There are two ways to handle this situation:-
We could note that the base
run( )method callshandle_data( )with the default argument of aNULLpointer, and simply have our implementation exit immediately for that input. -
We could simply override the
run( )method with our own implementation that omits this call.
The second solution may at first appear advantageous because it avoids a virtual method call toEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Conclusion
- InhaltsvorschauThe logging server application presented in this chapter provides a digestible but realistic vehicle for showing how to apply OO design/programming techniques, patterns, and frameworks to implementation software for networked applications. In particular, our OO framework demonstrates a number of beautiful design elements, ranging from abstract design to concrete elements in the implementations of the different concurrency models. Our design also uses C++ features, such as templates and virtual functions, in conjunction with design patterns, such as Wrapper Facade and the Template Method, to create a family of logging servers that is portable, reusable, flexible, and extensible.The Template Method pattern in the
Logging_Serverbase class'srun( )method allowed us to define common steps in a logging server, deferring specialization of individual steps of its operation to hook methods in derived classes. While this pattern helped factor out common steps into the base class, it did not adequately address all our required points of variability, such as synchronization and IPC mechanisms, For these remaining dimensions, therefore, we used the Wrapper Facade pattern to hide semantic and syntactic differences, ultimately making use of these dimensions entirely orthogonal to the implementation of individual concurrency models. This design allowed us to use parameterized classes to address these dimensions of variability, which increased the flexibility of our framework without affecting its performance adversely.Finally, our individual implementations of concurrency models, such as thread-per-connection and process-per-connection, used wrapper facades to make their implementations more elegant and portable. The end result was a labor-saving software architecture that enabled developers to reuse common design and programming artifacts, as well as provide a means to encapsulate variabilities in a common, parameterizable way.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Chapter 27: Integrating Business Partners the RESTful Way
- InhaltsvorschauAndrew PatzerA few years back, when i was a consultant, I went through a period of a year or two when it seemed that every client I spoke with was absolutely certain he needed a Web Services solution for his business. Of course, not many of my clients actually understood what that meant or the reasons why they might need that kind of architecture, but since they kept hearing about Web Services on the Internet, in magazines, and at trade shows, they figured they'd better get on the bus before it was too late.Don't get me wrong. I'm not against Web Services. I'm just not a big fan of making technical decisions based solely on whatever happens to be in style at the moment. This chapter will address some of the reasons for using a Web Services architecture, as well as explore some of the options to consider when integrating systems with the outside world.In this chapter, I'll walk you through a real-life project that involves exposing a set of services to a business partner, and discuss some design choices that were made along the way. Technologies that were used included Java (J2EE), XML, the Rosettanet E-Business protocol, and a function library used to communicate with a program running on an AS/400 system. I will also cover the use of interfaces and the factory design pattern as I show how I made the system extensible for future distributors who may use different protocols and may need to access different services.The project I'll be discussing in this chapter began with a call from one of our clients: "We need a set of Web Services to integrate our systems with one of our distributors." The client was a large manufacturer of electrical components. The system they were referring to was MAPICS, a manufacturing system written in RPG and running on their AS/400 machines. Their main distributor was upgrading its own business systems and needed to modify the way it tied into the order management system to check product availability and order status.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Project Background
- InhaltsvorschauThe project I'll be discussing in this chapter began with a call from one of our clients: "We need a set of Web Services to integrate our systems with one of our distributors." The client was a large manufacturer of electrical components. The system they were referring to was MAPICS, a manufacturing system written in RPG and running on their AS/400 machines. Their main distributor was upgrading its own business systems and needed to modify the way it tied into the order management system to check product availability and order status.Previously, an operator at the distributor simply connected remotely to the manufacturer's AS/400 system and pressed a "hot key" to access the necessary screens (F13 or F14 as I recall). As you'll see in the code later, the new system I developed for them was named hotkey, since that had become part of their common language, much like the way google has become a modern-day verb.Now that the distributor was implementing a new e-business system, it required an automated way to integrate manufacturer data with its own system. Since this was just a single distributor for this client, albeit the largest, the system also needed to allow for the future addition of other distributors and whatever protocols and requirements they might have. Another factor was the relatively low skill level of the staff that would maintain and extend this software. While they were very good in other areas, Java development (as well as any kind of web development) was still very new to them. So, I knew that whatever I built had to be simple and easy to extend.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Exposing Services to External Clients
- InhaltsvorschauPrior to this project, I had delivered several technical presentations to user groups and conferences regarding SOAP (Simple Object Access Protocol) and web service architecture. So, when the call came in, it seemed I'd be a natural fit for what the client was looking to accomplish. Once I understood what they really needed, though, I decided that they would be much better off with a set of services exposed through simple GET and POST requests over HTTP, exchanging XML data describing the requests and responses. Although I didn't know it at the time, this architectural style is now commonly referred to as REST, or Representational State Transfer.How did I decide to use REST over SOAP? Here are a few of the decision points to consider when choosing a Web Services architecture:
- How many different systems will require access to these services, and are all of them known at this time?
-
This manufacturer knew of a single distributor that needed to access its systems, but also acknowledged that others might decide to do the same in the future.
- Do you have a tight set of end users that will have advance knowledge of these services, or do these services need to be self-describing for anonymous users to automatically connect to?
-
Because there has to be a defined relationship between the manufacturer and all its distributors, it is guaranteed that each of the potential users will have advance knowledge of how to access the manufacturer's systems.
- What kind of state needs to be maintained throughout a single transaction? Will one request depend on the results of a previous one?
-
In our case, each transaction will consist of a single request and a corresponding result that doesn't depend on anything else.
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Routing the Service Using the Factory Pattern
- InhaltsvorschauOne of the requirements of this system was that it be able to accommodate a wide variety of future requests from several different types of systems with minimal programming effort. I believe I accomplished this by simplifying the implementation down to a single command interface that exposed the basic methods needed to respond to a wide variety of requests:
public interfaceHotkeyAdaptor {public void setXML(String _xml); public boolean parseXML(); public boolean executeQuery(); public String getResponseXML(); }
So, how does the servlet decide which concrete implementation of the interface to instantiate? It first looks inside the request data for a specific string to tell it what type of request it is. Then, it uses a static method of a factory object to pick the appropriate implementation.As far as the servlet knows, whatever implementation we're using will provide appropriate responses to each of these methods. By using an interface in the main servlet, we only have to write the execution code once, without any regard to which type of request it's dealing with or who may have made the request. All of the details are encapsulated in each individual implementation of this interface. Here's that snippet of code again from the servlet:HotkeyAdaptor hotkey =null; if (sb.toString().indexOf("Pip3A2PriceAndAvailabilityRequest") > 0) { // Price and Availability Request hotkey = HotkeyAdaptorFactory.getAdaptor( HotkeyAdaptorFactory.ROSETTANET, HotkeyAdaptorFactory.PRODUCTAVAILABILITY); } else if (sb.toString().indexOf("Pip3A5PurchaseOrderStatusQuery ") > 0) { // Order Status hotkey = HotkeyAdaptorFactory.getAdaptor( HotkeyAdaptorFactory.ROSETTANET, HotkeyAdaptorFactory.ORDERSTATUS); }
The factory object,HotkeyAdaptorFactory, has a static method that takes two parameters telling it which XML protocol to use and what type of request it is. These are defined as static constants in the factory object itself. As you can see by the following code, the factory object simply uses aEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Exchanging Data Using E-Business Protocols
- InhaltsvorschauSomething that was new to me on this project was the use of standard e-business protocols. When the distributor informed me of the requirement to exchange requests and responses using the Rosettanet standard, I had to do a little research. I started by going to the Rosettanet web site (http://www.rosettanet.org) and downloading the specific standards I was interested in. I found a diagram detailing a typical exchange between business partners, along with a specification for the XML request and response.Since I had a lot of trial-and-error type work to do, the first thing I did was set up a test that I could run myself to simulate an interaction with the distributor without having to coordinate testing with their staff for each iteration of development. I used the Apache Commons
HttpClientto manage the HTTP exchanges:public class TestHotKeyService {public static void main (String[] args) throws Exception { String strURL = "http://xxxxxxxxxxx/HotKey/HotKeyService"; String strXMLFilename = "SampleXMLRequest.xml"; File input = new File(strXMLFilename); PostMethod post = new PostMethod(strURL); post.setRequestBody(new FileInputStream(input)); if (input.length( ) < Integer.MAX_VALUE) { post.setRequestContentLength((int)input.length()); } else { post.setRequestContentLength( EntityEnclosingMethod.CONTENT_LENGTH_CHUNKED); } post.setRequestHeader("Content-type", "text/xml; charset=ISO-8859-1"); HttpClient httpclient = new HttpClient(); System.out.println("[Response status code]: " + httpclient.executeMethod(post)); System.out.println("\n[Response body]: "); System.out.println("\n" + post.getResponseBodyAsString( )); post.releaseConnection(); } }
This allowed me to accelerate my learning curve as I tried out several different types of requests and examined the results. I'm a firm believer in diving right into coding as soon as possible. You can only learn so much from a book, an article on a web site, or a set of API documents. By getting your hands dirty early on in the process, you'll uncover a lot of things you may not have thought about by simply studying the problem.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauIn looking back on this code I wrote over two years ago, I think it's pretty normal to second-guess myself and think of better ways I could have written it. While I may have written some of the implementation code differently, I think I'd still design the overall solution the same way. This code has stood the test of time, as the client has since added new distributors and new request types all on its own, with minimal help from outside service providers like me.Currently, as director of a bioinformatics department, I have used this code to demonstrate several things to my staff as I teach them object-oriented design principles and XML parsing techniques. I could have written about code I have developed more recently, but I think this demonstrates several basic principles that are important for any young software developer to understand.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 28: Beautiful Debugging
- InhaltsvorschauAndreas ZellerMy name is andreas, and i have been debugging." Welcome to Debuggers Anonymous, where you can tell your debugging story and find relief in the stories of others. So you have spent another night away from home? Good thing you've only been in front of the debugger. So you still cannot tell your manager when that showstopper will be fixed? Let's hope for the best! The fellow in the cubicle next to you brags about searching for a bug for 36 consecutive hours? Now that's impressive!No, there is nothing glamorous about debugging. It is the ugly duckling of our profession, the admission that we are far from perfect, the one activity that is the least predictable or accountable—and a constant impetus to feelings of remorse and guilt: "If only we had done better right from the start, we wouldn't be stuck in this mess." The defect is the crime; debugging is the punishment.Let us assume, though, that we have done all we can to prevent errors. Yet, from time to time, we will find ourselves in a situation where we need to debug. And as with all other activities, we need to handle debugging in the most professional, maybe even "beautiful" way.So, can there be any beauty in debugging? I believe there can. In my own life as a programmer, there have been a number of moments when I encountered true beauty in debugging. These moments not only helped me solve a problem at hand, but actually evolved into new approaches to debugging as a whole—approaches that are not only "beautiful" in some way, but actually boost your productivity in debugging. This is because they are systematic—they guarantee to guide you toward the problem solution—and partly even automatic—they do all the work while you pursue other tasks.Curious? Read on.My first experience of beauty in debugging was granted by one of my students. In her 1994 Master's thesis, Dorothea Lütkehaus built a visual debugger interface that provided a textbook visualization of data structures. shows a screenshot of her tool, called theEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Debugging a Debugger
- InhaltsvorschauMy first experience of beauty in debugging was granted by one of my students. In her 1994 Master's thesis, Dorothea Lütkehaus built a visual debugger interface that provided a textbook visualization of data structures. shows a screenshot of her tool, called the data display debugger, or ddd for short. As Dorothea demoed her debugger, the audience and myself were amazed: one could grasp complex data within seconds, and explore and manipulate it just by using the mouse.ddd was a wrapper for the command-line debuggers in use at this time (in particular gdb, the GNU debugger), which were very powerful tools but difficult to use. Since graphical user interfaces for programming tools were still scarce, ddd was a small revolution. In the following months, Dorothea and I did our best to make ddd the most beautiful debugger interface around, and it eventually became part of the GNU ecosystem.While debugging with ddd is usually more fun than using a command-line tool, it does not necessarily make you a more efficient debugger. For this, the debugging process is far more important than the tool. Incidentally, I learned this through ddd as well. It all started with a ddd bug report I received on July 31, 1998, which started my second experience of beauty in debugging. Here is the bug report:When using DDD with GDB 4.16, the run command correctly uses any prior command-line arguments, or the value of set args. However, when I switched to GDB 4.17, this no longer worked: If I entered a run command in the console window, the prior command-line options would be lost.Those gdb developers had done it again—releasing a new gdb version that behaved slightly differently from the earlier version. Because ddd was a frontend, it actually sent commands to gdb, just like a human would do, and parsed the gdb replies to present its information in the user interface. Something in this process had apparently failed.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- A Systematic Process
- InhaltsvorschauWhen programmers debug a program, they search for a failure cause, which could lie in the code, the input, or the environment. This cause must be found and eliminated. Once the cause is eliminated, the program works. (Should the program still fail once the cause is eliminated, we may need to revise our beliefs about the cause.)The general process for finding causes is called the scientific method. Applied to program failures, it works as follows:
-
Observe a program failure.
-
Invent a hypothesis for the failure cause that is consistent with the observations.
-
Use the hypothesis to make predictions.
-
Put your predictions to the test by experiments and further observations:
-
If experiment and observation satisfy the prediction, refine the hypothesis.
-
If not, find an alternate hypothesis.
-
-
Repeat steps 3 and 4 until the hypothesis can no longer be refined.
Eventually, the hypothesis thus becomes a theory. This means that you have a conceptual framework that explains (and predicts) some aspect of the universe. Your failing program may be a very small aspect of the universe, but still, the resulting theory should neatly predict where you should fix your program.To obtain such a theory, programmers apply the scientific method as they walk back through the cause-effect-chain that led to the failure. They:-
Observe a failure ("The output is wrong").
-
Come up with a hypothesis about the failure cause ("The problem may be that
yhas a wrong value"). -
Make a prediction ("If
yis wrong, its value may come fromf()in line 632").
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- A Search Problem
- InhaltsvorschauLet's come back to our initial problem of debugging the debugger. Even after isolating and fixing the bug, I wondered: is there a way to find the failure-inducing change automatically? What one would need is a test that is automatically invoked each time the programmer changes something. As soon as the test broke, we would know what had changed most recently so we could immediately fix it. (A few years later, David Saff and Michael Ernst implemented this idea under the name continuous testing.)In my situation, I knew the change that had caused the test to fail—it was the change from gdb 4.16 to gdb 4.17. The problem was that the change was so huge, affecting 8,721 locations. There should be a way to break down this change into smaller pieces.What if I tried to split it into 8,721 smaller changes, each affecting just one location? This way, I could apply and test one change after the other until the test failed—and the last change applied would be the one that broke the test. In other words, I would simulate the 4.17 version's development history. (Actually, it would not be me who would simulate the history; instead, it would be a tool I built. And while I would be sitting sipping my tea, playing a game with my kids, or catching up my email stream, this nifty little tool would search and find the failure-inducing change. Neat.)There was a catch, though. I had no clue about the order in which the changes had to be applied. And this was crucial because the individual changes may depend on each other. For instance, change A may introduce a variable that would be used in new code included in other changes B or C. Whenever B or C are applied, A must be applied, too; otherwise, building gdb would fail. Likewise, change X may rename some function definition; other changes (Y, Z) in other locations may reflect this renaming. If X is applied, Y and Z must be applied as well, because again, otherwise,Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Finding the Failure Cause Automatically
- InhaltsvorschauIt took me three months to come up with a solution—which came to me, incidentally, lying in my bed at six in the morning. The sun was rising, the birds were singing, and I finally had an idea. The reasoning was as follows:
-
Applying half of the changes has a small chance of getting a consistent build; the risk of skipping a dependent change is simply too high. On the other hand, if we get a consistent build (or a "resolved" outcome), we can very quickly narrow down the set of changes.
-
On the other hand, applying individual changes has a far greater chance of getting something meaningful—in particular, if the version being changed was already consistent. As an example, think of changing a single function; unless its interface changes, we will most likely get a running program. On the other hand, trying out one change after the other would still take ages.
Therefore, I'd call for a compromise: I would start with two halves. If neither half of the changes would result in a testable build, I would split the set of changes into four subsets instead, and then apply each subset individually to the gdb 4.16 source. In addition, I would also unapply the subset from the gdb 4.17 source (which would be realized by applying the complement of the subset to the gdb 4.16 source).Splitting in four (rather than two) would mean that smaller change sets would be applied, which means that the changed versions would be closer to the (working) original versions—and which would imply higher chances of getting a consistent build.And if four subsets would not suffice, then I would go for 8, 16, 32, and so on, until, eventually, I would apply each single change individually, one after the other—which should give me the greatest chance of getting a consistent build. As soon as I had a testable build, the algorithm would restart from scratch.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Delta Debugging
- InhaltsvorschauOver the next several months, I refined and optimized the algorithm as well as the tool, such that eventually it would need just one hour to find the failure-inducing change. I eventually published the algorithm under the name delta debugging because it "debugs" programs by isolating a delta, or difference between two versions.Here I'll show my Python implementation of the delta debugging algorithm. The function
dd()takes three arguments—two lists of changes as well as a test:-
The list
c_passcontains the "working" configuration—in our case, the list of changes that must be applied to make the program work. In our case (which is typical), this is the empty list. -
The list
c_failcontains the "failing" configuration—in our case, the list of changes required to make the program fail. In our case, this would be a list of 8,721 changes (which we would encapsulate in, say,Changeobjects). -
The
testfunction accepts a list of changes, applies them, and runs a test. It returnsPASS, FAIL, orUNRESOLVEDas the outcome, depending on whether the test was successful, failed, or had an unresolved outcome. In our case, thetestfunction would apply the changes via patch and run the test as just described. -
The
dd( )function systematically narrows down the difference betweenc_passandc_fail, and eventually returns a triple of values. The first of these values would be the isolateddelta—in our case, a singleChangeobject containing the one-line change to the gdb source code.
If you plan to implementdd()yourself, you can easily use the code shown here (and included on the O'Reilly web site for this book). You also need three supporting functions:split(list, n)
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Minimizing Input
- InhaltsvorschauThe interesting thing about delta debugging (or any other automation of the scientific method) is that it is very general. Rather than search for causes in a set of changes, you can also search for causes in other search spaces. For instance, you can easily apply delta debugging to search for failure causes in program input, which Ralf Hildebrandt and I did in 2002.When searching for causes in program input, the program code stays the same: no application of changes, no reconstruction, just execution. Instead, it is the input that changes. Think of a program that works on most inputs, but fails on one specific input. One can easily have delta debugging isolate the failure-inducing difference between the two inputs: "The cause of the web browser crashing is the
<SELECT>tag in line 40."One can also modify the algorithm so that it returns a minimized input: "To have the web browser crash, just feed it a web page containing<SELECT>." In minimized input, every single remaining character is relevant for the failure to occur. Minimized inputs can be very valuable for debuggers because they make things simple: they lead to shorter executions and smaller states to be examined. As an important (and perhaps beautiful) side effect, they capture the essence of the failure.I once met some programmers who were dealing with bugs in a third-party database. They had very complex, machine-generated SQL queries that sometimes would cause the database to fail. The vendor did not categorize these bugs as high priority because "you are our only customer dealing with such complex queries." Then the programmers simplified a one-page SQL query to a single line that still triggered the failure. Faced with this single line, the vendor immediately gave the bug the highest priority and fixed it.How does one achieve minimization? The easiest way is to feeddd()with an emptyc_pass, and to have a passing test return "pass" only if the input is empty, and "unresolved" otherwise.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Hunting the Defect
- InhaltsvorschauIn principle, delta debugging could also minimize an entire program code so as to keep only what is relevant. Suppose your web browser crashes while printing a HTML page. Applying delta debugging on the program code means that only the bare code required to reproduce the failure would remain. Doesn't this sound neat? Unfortunately, it would hardly work in practice. The reason is that the elements of program code are heavily dependent on each other. Remove one piece, and everything breaks apart. The chances of getting something meaningful by randomly removing parts are very slim. Therefore, delta debugging would almost certainly require a quadratic number of tests. Even for a 1,000-line program, this would already mean a million tests—and years and years of waiting. We never had that much time, so we never implemented that feat.Nonetheless, we still wanted to hunt down failure causes not only in the input or the set of changes, but in the actual source code—in other words, we wanted to have the statements that caused the program to fail. (And, of course, we wanted to get them automatically.)Again, this was a task that turned out to be achievable by delta debugging. However, we did not get there directly. We wanted to make a detour via program states—that is, the set of all program variables and their values. In this set, we wanted to determine failure causes automatically, as in "At the call to
shell_sort(), variable size causes the failure." How would that work?Let us recapitulate what we had done so far. We had done delta debugging on program versions—one that worked and one that failed—and isolated the minimal difference that caused the failure. We had done delta debugging on program inputs—again, one that worked and one that failed—and isolated minimal differences that caused the failure. If we applied delta debugging on program states, we would take one program state from a working run, and one program state from a failing run, and eventually obtain the minimal difference that caused the failure.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - A Prototype Problem
- InhaltsvorschauThere is a difference between what you can do in a lab and what you can do in production. The main trouble with our approach was that it was fragile. Very fragile. Extracting accurate program states is a tricky business. Suppose you are working on a C program that just stopped in a debugger. You find a pointer. Does it point to something? If so, what is the C data type of the variable is it pointing to? And how many elements does it point to? In C, all this is left to the discretion of the programmer—and it's awfully hard to guess the memory management used in the program at hand.Another problem is to determine where the program state ends and the system state begins. Some state is shared between applications, or between applications and the system. Where would we stop extracting and comparing?For lab experiments, these issues could be addressed and confined, but for a full-fledged, robust industrial approach, we found them insurmountable. And this is why, today, people still have to use interactive debuggers.The future is not that bleak, though. Command-line tools that implement delta debugging on input are available. The ddchange plug-in for Eclipse brings delta debugging on changes to your desktop. Researchers apply delta debugging on method calls, nicely integrating capture/replay with test case minimization. And finally, through all these automated approaches, we have gained a much better understanding of how debugging works and how it can be done in a systematic, sometimes even automatic way—a way that is hopefully most effective, and maybe even "beautiful."Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Conclusion
- InhaltsvorschauIf by any chance, you are forced to debug something, you can strive to make your debugging experience as painless as possible. Being systematic (by following the scientific method) helps a lot. Automating the scientific method helps even more. The best thing you can do, though, is invest effort into your code and your development process. By following the advice in this very book, you will write beautiful code—and, as a side effect, also achieve the most beautiful debugging. And what is the most beautiful debugging? Of course: no debugging at all!Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Acknowledgments
- InhaltsvorschauI'd like to thank the students with whom I have experienced beauty in debugging tools. Martin Burger took great part in the AskIgor effort and implemented the ddchange plug-in for Eclipse. Holger Cleve researched and implemented automated isolation of failure-inducing statements. Ralf Hildebrandt implemented isolation of failure-inducing input. Karsten Lehmann contributed to AskIgor and implemented isolation of failure-inducing program states for Java. Dorothea Lütkehaus wrote the original version of ddd. Thomas Zimmermann implemented the graph-comparison algorithms. Christian Lindig and Andrzey Wasylkowski provided helpful comments on earlier revisions of this chapter.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Further Reading
- InhaltsvorschauI have compiled my experiences on systematic and automatic debugging in a university course. This is where you can learn more about the scientific method and delta debugging—but also about many more debugging and analysis techniques, such as statistical debugging, automated testing, or static bug detection. All lecture slides and references are available at http://www.whyprogramsfail.com.If you are looking specifically for scientific publications of my group, see the delta debugging home page at http://www.st.cs.uni-sb.de/dd.Finally, a web search on "delta debugging" will point you to a variety of resources, including further publications and implementations.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 29: Treating Code As an Essay
- InhaltsvorschauYukihiro MatsumotoPrograms share some attributes with essays. For essays, the most important question readers ask is, "What is it about?" For programs, the main question is, "What does it do?" In fact, the purpose should be sufficiently clear that neither question ever needs to be uttered. Still, for both essays and computer code, it's always important to look at how each one is written. Even if the idea itself is good, it will be difficult to transmit to the desired audience if it is difficult to understand. The style in which they are written is just as important as their purpose. Both essays and lines of code are meant—before all else—to be read and understood by human beings.You may ask: "Are human beings actually supposed to be the ones reading computer programs?" The assumption is that people use programs to tell computers what to do, and computers then use compilers or interpreters to compile and understand the code. At the end of the process, the program is translated into machine language that is normally read only by the CPU. That is, of course, the way things work, but this explanation only describes one aspect of computer programs.Most programs are not write-once. They are reworked and rewritten again and again in their lives. Bugs must be debugged. Changing requirements and the need for increased functionality mean the program itself may be modified on an ongoing basis. During this process, human beings must be able to read and understand the original code; it is therefore more important by far for humans to be able to understand the program than it is for the computer.Computers can, of course, deal with complexity without complaint, but this is not the case for human beings. Unreadable code will reduce most people's productivity significantly. On the other hand, easily understandable code will increase it. And we see beauty in such code.What makes a computer program readable? In other words, what is beautiful code? Although different people have different standards about what a beautiful program might be, judging the attributes of computer code is not simply a matter of aesthetics. Instead, computer programs are judged according to how well they execute their intended tasks. In other words, "beautiful code" is not an abstract virtue that exists independent of its programmers' efforts. Rather, beautiful code is really meant to help the programmer be happy and productive. This is the metric I use to evaluate the beauty of a program.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 30: When a Button Is All That Connects You to the World
- InhaltsvorschauEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Basic Design Model
- InhaltsvorschauNeedless to say, the software needed to be efficient, so that the user can type quickly without having to click too often. It sometimes takes Professor Hawking minutes to type a single word, so every improvement in editing speed would be useful for a busy man.The software certainly needed to be highly customizable. The nature and size of the vocabulary of our users might vary vastly. The software would need to be able to adapt to these. Further, we were keen to ensure that the disabled person could change as many settings and configurations as possible herself, without the intervention of a helper.Since we had so little by way of job specification to go on, and no experience in writing such software, we expected to make fairly serious changes in the design as our understanding grew. Keeping all these requirements in mind, we decided to write the software in Visual Basic version 6, an excellent rapid prototyping tool with a large variety of ready-made controls. VB made it easy to build a graphical user interface and provided convenient access to database features.Unique to this problem was the unusually high asymmetry in data flow. A user who could see reasonably well would have a large capacity for taking in information. From persons with extreme motor disability, however, very little data flowed in the other direction: just the occasional bit.The software offers choices one by one to the user, who accepts a choice by clicking when the desired one is presented. The problem is, of course, that there are so many choices at any point. She may wish to type any one of dozens of characters, or save, scroll, find, or delete text. It would take too long if eLocutor were to cycle through all choices, so it organizes them in groups and subgroups, structured as a tree.To speed up typing, eLocutor looks ahead, offering ways to complete the word being typed, and choices for the next word and the rest of the phrase. The user needs to be kept aware of these guesses, so that he can spot opportunities for a shortcut, should one become available.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Input Interface
- InhaltsvorschauAs the single binary input, we selected the right mouse button. This allowed a variety of buttons to easily be connected to eLocutor. By opening up the mouse and soldering the desired button in parallel with the right mouse button, any electrician or hobbyist should be able to make the connection.shows how we made a temporary connection for Professor Hawking's special switch: the circuit board at the left bottom is taken from the inside of a mouse, and the points at which the external switch was soldered are the ones where the right mouse button is connected.
Figure 30-2: Connecting Professor Hawking's switch in parallel to the right mouse buttonIf you can provide the software only a single binary input, one part of the graphic user interface is obvious: all choices have to be presented turn by turn in the form of a binary tree. At each node, if the user clicks within a fixed time, the interface selects it, which might open up further choices in the form of a subtree. If the user does not click, the software automatically takes him to the next sibling of the node and waits again for a click.To implement this tree, we used the Visual Basic TreeView control. This should be looked upon as a tree that grows from left to right. If, at any node, you click within a user-selected time interval—which is set using a Timer control—you expand the node and climb up the tree (i.e., move to the right), or, if it is a leaf node, carry out some action. If you don't click, eLocutor shifts its focus to the next sibling of the node. If the bottom is reached without a click, eLocutor starts again with the node at the top.We populated the tree such that it provides, at each level in the tree, a node called Up that, if selected, takes the highlight to its parent, one level closer to the root.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Efficiency of the User Interface
- InhaltsvorschauIn order to help in evaluating the efficiency of eLocutor in helping you type, the bottom right of the screen shows two numbers (see ). These indicate the number of clicks and the number of seconds between the last click and the first, since the middle box was last empty.We found that when the prediction worked reasonably well, the ratio of clicks to characters typed was better than 0.8—i.e., it usually required significantly fewer clicks than an able-bodied person would have needed using a full keyboard. When prediction was poor—for instance when constructing a sentence radically different from any in the database—it required up to twice as many clicks as characters typed.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Download
- InhaltsvorschauELocutor is free, open source software downloadable from http://holisticit.com/eLocutor/elocutorv3.htm. A discussion list is located at http://groups.yahoo.com/group/radiophony.Part of the download is the entire source code. I might warn you, though, that it bears some resemblance to a bowl of spaghetti, for which I bear full responsibility. I hadn't programmed for more than 10 years when I started this project, so my skills were outdated and rusty. I did not have a design, only a few hints now and then on the direction in which it should evolve. The code grew with my understanding of the problem, and the results show it. Very simple programming techniques have been used, as is obvious from the code shown in this chapter.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Future Directions
- InhaltsvorschaueLocutor was always intended as a rapid application development (RAD) project, something that would allow me to show Professor Hawking during our infrequent and short meetings how far the design had progressed. The intention was to rewrite it some time, after the design could be frozen in the shape of a working prototype that people had actually been using and providing feedback on. At that point, I would pick a programming language that worked across platforms, so that Macs or Linux should also be accessible to those severely motor disabled.However, inspired by what T. V. Raman achieved with Emacspeak (covered in ), I am now considering an entirely different kind of project. Emacs is, of course, not just an editor, but a very versatile platform that people have extended over the years to allow you to read mail, handle appointments, browse the Web, execute shell commands, etc. By merely adding on a smart text-to-speech capability and context-sensitive commands, Raman brilliantly made everything that could be accessed through Emacs accessible to the blind.So, I'm wondering whether the same can be done for the motor disabled. Advantages of this approach are:
-
Designers would no longer have to worry about the mouse, for Emacs allows you to do everything without it.
-
eLocutor wouldn't just be an accessible editor, but would rather make all the capabilities of the computer accessible.
-
I might also find more support this way among the open source developer community, which seems to be far better on platforms historically associated closely with Emacs use than in the MS Windows world.
I am therefore appealing to readers of this chapter to teach me how to extend Emacs such that the same one-button navigation of a tree becomes possible. Better still would be someone wishing to take up this project with whatever help I might be able to provide.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Chapter 31: Emacspeak: The Complete Audio Desktop
- InhaltsvorschauT. V. RamanA desktop is a workspace that one uses to organize the tools of one's trade. Graphical desktops provide rich visual interaction for performing day-to-day computing tasks; the goal of the audio desktop is to enable similar efficiencies in an eyes-free environment. Thus, the primary goal of an audio desktop is to use the expressiveness of auditory output (both verbal and nonverbal) to enable the end user to perform a full range of computing tasks:
-
Communication through the full range of electronic messaging services
-
Ready access to local documents on the client and global documents on the Web
-
Ability to develop software effectively in an eyes-free environment
The Emacspeak audio desktop was motivated by the following insight: to provide effective auditory renderings of information, one needs to start from the actual information being presented, rather than a visual presentation of that information. This had earlier led me to develop AsTeR, Audio System For Technical Readings (http://emacspeak.sf.net/raman/aster/aster-toplevel.html). The primary motivation then was to apply the lessons learned in the context of aural documents to user interfaces—after all, the document is the interface.The primary goal was not to merely carry the visual interface over to the auditory modality, but rather to create an eyes-free user interface that is both pleasant and productive to use.Contrast this with the traditional screen-reader approach where GUI widgets such as sliders and tree controls are directly translated to spoken output. Though such direct translation can give the appearance of providing full eyes-free access, the resulting auditory user interface can be inefficient to use.These prerequisites meant that the environment selected for the audio desktop needed:-
A core set of speech and nonspeech audio output services
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Producing Spoken Output
- InhaltsvorschauI started implementing Emacspeak in October 1994. The target environments were a Linux laptop and my office workstation. To produce speech output, I used a DECTalk Express (a hardware speech synthesizer) on the laptop and a software version of the DECTalk on the office workstation.The most natural way to design the system to leverage both speech options was to first implement a speech server that abstracted away the distinction between the two output solutions. The speech server abstraction has withstood the test of time well; I was able to add support for the IBM ViaVoice engine later, in 1999. Moreover, the simplicity of the client/server API has enabled open source programmers to implement speech servers for other speech engines.Emacspeak speech servers are implemented in the TCL language. The speech server for the DECTalk Express communicated with the hardware synthesizer over a serial line. As an example, the command to speak a string of text was a
procthat took a string argument and wrote it to the serial device. A simplified version of this looks like:proc tts_say {text} {puts -nonewline $tts(write) "$text"}The speech server for the software DECTalk implemented an equivalent, simplifiedtts_sayversion that looks like:proc say {text} {_say "$text"}where_saycalls the underlying C implementation provided by the DECTalk software.The net result of this design was to create separate speech servers for each available engine, where each speech server was a simple script that invoked TCL's default readeval-print loop after loading in the relevant definitions. The client/server API therefore came down to the client (Emacspeak) launching the appropriate speech server, caching this connection, and invoking server commands by issuing appropriate procedure calls over this connection.Notice that so far I have said nothing explicit about how this client/server connection was opened; this late-binding proved beneficial later when it came to making Emacspeak network-aware. Thus, the initial implementation worked by the Emacspeak client communicating to the speech server usingEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Speech-Enabling Emacs
- InhaltsvorschauThe simplicity of the speech server abstraction described above meant that version 0 of the speech server was running within an hour after I started implementing the system. This meant that I could then move on to the more interesting part of the project: producing good quality spoken output. Version 0 of the speech server was by no means perfect; it was improved as I built the Emacspeak speech client.A friend of mine had pointed me at the marvels of Emacs Lisp advice a few weeks earlier. Som when I sat down to speech-enable Emacs, advice was the natural choice. The first task was to have Emacs automatically speak the line under the cursor whenever the user pressed the up/down arrow keys.In Emacs, all user actions invoke appropriate Emacs Lisp functions. In standard editing modes, pressing the down arrow invokes function
next-line, while pressing the up arrow invokesprevious-line. To speech-enable these commands, version 0 of Emacspeak implemented the following rather simple advice fragment:(defadvice next-line (after emacspeak) "Speak line after moving." (when (interactive-p) (emacspeak-speak-line)))
Theemacspeak-speak-linefunction implemented the necessary logic to grab the text of the line under the cursor and send it to the speech server. With the previous definition in place, Emacspeak 0.0 was up and running; it provided the scaffolding for building the actual system.The next iteration returned to the speech server to enhance it with a well-defined eventing loop. Rather than simply executing each speech command as it was received, the speech server queued client requests and provided alaunchcommand that caused the server to execute queued requests.The server used theselectsystem call to check for newly arrived commands after sending each clause to the speech engine. This enabled immediate silencing of speech; with the somewhat naïve implementation described in version 0 of the speech server, the command to stop speech would not take immediate effect since the speech server would first process previously issuedEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Painless Access to Online Information
- InhaltsvorschauWith all the necessary affordances to generate rich auditory output in place, speech-enabling Emacs applications using Emacs Lisp's advice facility requires surprisingly small amounts of specialized code. With the TTS layer and the Emacspeak core handling the complex details of producing good quality output, the speech-enabling extensions focus purely on the specialized semantics of individual applications; this leads to simple and consequently beautiful code. This section illustrates the concept with a few choice examples taken from Emacspeak's rich suite of information access tools.Right around the time I started Emacspeak, a far more profound revolution was taking place in the world of computing: the World Wide Web went from being a tool for academic research to a mainstream forum for everyday tasks. This was 1994, when writing a browser was still a comparatively easy task. The complexity that has been progressively added to the Web in the subsequent 12 years often tends to obscure the fact that the Web is still a fundamentally simple design where:
-
Content creators publish web resources addressable via URIs.
-
URI-addressable content is retrievable via open protocols.
-
Retrieved content is in HTML, a well-understood markup language.
Notice that the basic architecture just sketched out says little to nothing about how the content is made available to the end user. The mid-1990s saw the Web move toward increasingly complex visual interaction. The commercial Web with its penchant for flashy visual interaction increasingly moved away from the simple data-oriented interaction that had characterized early web sites. By 1998, I found that the Web had a lot of useful interactive sites; to my dismay, I also found that I was using progressively fewer of these sites because of the time it took to complete tasks when using spoken output.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- Summary
- InhaltsvorschauEmacspeak was conceived as a full-fledged, eyes-free user interface to everyday computing tasks. To be full-fledged, the system needed to provide direct access to every aspect of computing on desktop workstations. To enable fluent eyes-free interaction, the system needed to treat spoken output and the auditory medium as a first-class citizen—i.e., merely reading out information displayed on the screen was not sufficient.To provide a complete audio desktop, the target environment needed to be an interaction framework that was both widely deployed and fully extensible. To be able to do more than just speak the screen, the system needed to build interactive speech capability into the various applications.Finally, this had to be done without modifying the source code of any of the underlying applications; the project could not afford to fork a suite of applications in the name of adding eyes-free interaction, because I wanted to limit myself to the task of maintaining the speech extensions.To meet all these design requirements, I picked Emacs as the user interaction environment. As an interaction framework, Emacs had the advantage of having a very large developer community. Unlike other popular interaction frameworks available in 1994 when I began the project, it had the significant advantage of being a free software environment. (Now, 12 years later, Firefox affords similar opportunities.)The enormous flexibility afforded by Emacs Lisp as an extension language was an essential prerequisite in speech-enabling the various applications. The open source nature of the platform was just as crucial; even though I had made an explicit decision that I would modify no existing code, being able to study how various applications were implemented made speech-enabling them tractable. Finally, the availability of a high-quality advice implementation for Emacs Lisp (note that Lisp'sEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Acknowledgments
- InhaltsvorschauEmacspeak would not exist without Emacs and the ever-vibrant Emacs developer community that has made it possible to do everything from within Emacs. The Emacspeak implementation would not have been possible without Hans Chalupsky's excellent advice implementation for Emacs Lisp.Project libxslt from the GNOME project has helped breathe fresh life into William Perry's Emacs W3 browser; Emacs W3 was one of the early HTML rendering engines, but the code has not been updated in over eight years. That the W3 code base is still usable and extensible bears testimony to the flexibility and power afforded by Lisp as the implementation language.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 32: Code in Motion
- InhaltsvorschauLaura Wingerd and Christopher SeiwaldThe main point is that every successful piece of software has an extended life in which it is worked on by a succession of programmers and designers….Bjarne StroustrupEarly in the planning of this book, greg wilson asked contributors whether Beautiful Code was an apt title. "Much of what you're going to discuss is software design and architecture, rather than code per se," he wrote us.But this chapter is about the code. It's not about what the code does, nor is it about how beautifully it does it. Instead, this chapter is about how the code looks: specifically, how certain human-visible traits of coding make serial collaboration possible. It's about the beauty of "code in motion."What you're about to read is borrowed largely from Christopher Seiwald's article, "The Seven Pillars of Pretty Code." In a nutshell, the Seven Pillars are:
-
Being "bookish"
-
Making alike look alike
-
Overcoming indentation
-
Disentangling code blocks
-
Commenting code blocks
-
Decluttering
-
Blending in with existing style
While these may sound like mere coding conventions, they're more than that. They're the outward manifestations of a coding practice that keeps product evolution in mind.In this chapter, we'll see how the Seven Pillars have supported a piece of code that has been part of a commercial software system for over 10 years. That piece of code is DiffMerge, a component of the Perforce software configuration management system. DiffMerge's job is to produce a classic three-way merge, comparing two versions of a text file ("leg 1" and "leg 2") to a reference version ("the base"). The output interleaves lines of the input files with placeholders marking the lines that conflict. If you've used Perforce, you've seen DiffMerge at work in theEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. -
- On Being "Bookish"
- Inhaltsvorschau"The Seven Pillars of Pretty Code" describes guidelines we use at Perforce Software. The Seven Pillars aren't the only coding guidelines we use, nor are they applied to all of our development projects. We apply them to components such as DiffMerge where the same code is likely to be active in several concurrently supported releases and modified by many programmers. The effect of the Seven Pillars is to make code more comprehensible to the programmers who read it, in more of the contexts in which they find themselves reading it.Take, for example, the Seven Pillars' advice to be "bookish." Book and magazine text is composed in columns, usually in columns far narrower than the page. Why? Because narrowness reduces the back-and-forth scanning our eyes must do as we read—reading is easier when our eyes work less. Reading is also easier when what we've just read and what we're about to read are both within our visual range. Research shows that as our eyes change focus from word to word, our brains can take cues from surrounding, unfocused shapes. The more our brains can glean "advance warning" from shapes within the visual periphery, the better they're able to direct our eyes for maximum comprehension.Research also seems to show that, when it comes to line lengths of text, there's a difference between reading speed and reading comprehension. Longer lines can be read faster, but shorter lines are easier to comprehend.Chunked text is also easier to comprehend than a continuous column of text. That's why columns in books and magazines are divided into paragraphs. Paragraphs, verses, lists, sidebars, and footnotes are the "transaction markers" of text, saying to our brains, "Have you grokked everything so far? Good, please go on."Code is not strictly text, of course, but for the purpose of human readability, the same principles apply. Bookish code—that is, code formatted in book-like columns and chunks—is easier to comprehend.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Alike Looking Alike
- InhaltsvorschauThe DiffMerge snippet in the previous section also illustrates another principle of writing for comprehensibility: code that is alike looks alike. We see this throughout the DiffMerge code. For example:
while( d.diffs == DD_CONF && ( bf->end != bf->Lines( ) || lf1->end != lf1->Lines( ) || lf2->end != lf2->Lines( ) ) )
The preceding demonstrates how line breaks can create a visual pattern that makes it easier for our brains to recognize a logical pattern. We can tell at a glance that three of the four tests in thiswhilestatement are essentially the same.Here's one more example of alike looking alike. This one illustrates coding that lets our brains do a successful "one of these is not the same" operation:case MS_BASE: /* dumping the original */ if( selbits = selbitTab[ DL_BASE ][ diffDiff ] ) { readFile = bf; readFile->SeekLine( bf->start ); state = MS_LEG1; break; } case MS_LEG1: /* dumping leg1 */ if( selbits = selbitTab[ DL_LEG1 ][ diffDiff ] ) { readFile = lf1; readFile->SeekLine( lf1->start ); state = MS_LEG2; break; } case MS_LEG2: /* dumping leg2 */ if( selbits = selbitTab[ DL_LEG2 ][ diffDiff ] ) { readFile = lf2; readFile->SeekLine( lf2->start ); } state = MS_DIFFDIFF; break;Even if you don't know what this code is about, it's clear to see, for example, thatreadfileandstateare set in all three cases, but only in the third case isstateset unconditionally. The programmer who wrote this paid attention to making alike look alike; those of us reading it later can see at a glance where the essential logic is.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Perils of Indentation
- InhaltsvorschauWe've all been taught to use indentation to show the depth of nesting in logical blocks of code. The deeper the nesting, the farther to the right of the page the nested code appears. Formatting code this way is a good idea, but not because it makes the code any easier to read.If anything, deeply indented code is harder to read. Important logic is crowded off to the right, submerged to almost footnote-like insignificance by the layers of if-then-else code that surrounds it, while trivial tests applied in outer blocks seem elevated in importance. So, while indentation is useful for showing where blocks begin and end, it doesn't make the code any easier for us to comprehend.The greater peril is not the indentation, however: it's the nesting. Nested code strains human comprehension, plain and simple. Edward Tufte was not being complimentary when he wrote that "Sometimes [PowerPoint] bullet hierarchies are so complex and intensely nested that they resemble computer code." In Code Complete (see "Further Reading," at the end of this chapter), Steve McConnell warns against using nested
ifstatements—not because they're inefficient or ineffective, but because they're hard on human comprehension. "To understand the code, you have to keep the whole set of nestedifs in your mind at once," he says. It's no surprise that research points to nested conditionals as the most bug-prone of all programming constructs. We have a bit of anecdotal evidence for this, as you'll read in the section "DiffMerge's Checkered Past."So the value of indenting each nesting level is not in making code more comprehensible, but in making the coder more aware of incomprehensibility. "Seven Pillars" advises coders to "overcome indentation"—that is, to code without deep nesting. "This is the most difficult of these practices," admits Seiwald, "as it requires the most artifice and can often influence the implementation of individual functions."Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Navigating Code
- InhaltsvorschauAll Seven Pillars are illustrated in DiffMerge, to one extent or another. For example, the DiffMerge code is constructed of discrete, logical blocks. Each block does a single thing or single kind of thing, and what it does is either self-evident or described in a comment that prefaces the block. Code like this is the result of the Seven Pillars' advice to disentangle and comment code blocks. It's analogous to well-organized expository writing, where definition lists and titled sections help readers navigate densely packed technical information.The lack of clutter also makes the DiffMerge code easier to navigate. One of DiffMerge's clutter-reducing tricks is to use tiny names for variables that are referenced repeatedly throughout the code. This goes against sage advice to use meaningful and descriptive variable names, to be sure. But there's a point at which overuse of long names creates so much clutter that it only impedes comprehension. Writers know this; it's why we use pronouns, surnames, and acronyms in our prose.The comments in DiffMerge are also clutter-free. In 10-year-old code, it's easy to end up with as many comments that describe how the code used to work—along with additional comments about what changed—as comments that describe the current code. But there isn't really any reason to keep a running history of the program's evolution in the code itself; your source control management system has all that information and offers much better ways to track it. (We'll show examples in the next section.) The programmers working on DiffMerge have done a good job keeping the closets clean, as it were. The same goes for code. In DiffMerge, the old code isn't simply commented out, it's gone. The remaining code and comments are uncluttered with the distractions of history.The DiffMerge code also makes liberal use of whitespace. In addition to reducing the appearance of clutter, whitespace increases comprehensibility. When bookish chunks and alike patterns are set off by whitespace, they take on visual shapes our brains can recognize. As we read through the code, our brains index these shapes; later, we unconsciously use these shapes to find code we remember having read.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Tools We Use
- InhaltsvorschauCertainly we need code to be understandable when we read the source files directly. We also need the code to be understandable when we encounter it in diffs, merges, patches, debuggers, code inspections, compiler messages, and a variety of other contexts and tools. As it turns out, code written with the Seven Pillars in mind is more readable in more of the tools we use to manage code.For example, DiffMerge code is human-readable even without syntax highlighting. In other words, we don't have to depend on context-sensitive source code editors to read it. It's just as readable when displayed as plain text by debuggers, compilers, and web browsers. Here's a snippet of DiffMerge in gdb:
Breakpoint 3, DiffMerge::DiffDiff (this=0x80e10c0) at diff/diffmerge.cc:510 510 Mode initialMode = d.mode; (gdb)list 505,515 505 DiffGrid d = diffGrid 506 [ df1->StartLeg() == 0 ][ df1->StartBase( ) == 0 ] 507 [ df2->StartLeg() == 0 ][ df2->StartBase( ) == 0 ] 508 [ df3->StartL1() == 0 ][ df3->StartL2( ) == 0 ]; 509 510 Mode initialMode = d.mode; 511 512 // Pre-process rules, typically the outer snake information 513 // contradicts the inner snake, its not perfect, but we use 514 // the length of the snake to determine the best outcome. 515 (gdb) print d $1 = {mode = CONF, diffs = DD_CONF} (gdb)
When working on code that changes as often as DiffMerge (it has been changed 175 times since it was first written), programmers spend considerable time looking at it in diff and merge tools. What these tools have in common is that they restrict the horizontal view of source files, and they add a certain amount of clutter of their own. Code like DiffMerge's is readable even in these conditions. In command-line diffs, its lines don't wrap. In graphical diff tools, we don't have to fiddle with the horizontal scroll bar to see the bulk of its lines, as demonstrates.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - DiffMerge's Checkered Past
- InhaltsvorschauWe have a record of every change, branch, and merge involving DiffMerge throughout its 10-year history. And it's an interesting record. offers a thumbnail view of changes to the released versions of DiffMerge. It shows that DiffMerge originated in the mainline (the lowest line on the diagram) and has been branched into over 20 releases. Changes to DiffMerge have occured in the mainline, for the most part. But the thumbnail shows peculiar activity in some of the most recent releases.
Figure 32-3: DiffMerge's release branchesA count of DiffMerge's patches per release branch, seen in , is even more intriguing. It shows that DiffMerge was rarely patched after it was released—until the 2004.2 release, that is. Then the post-release patch rate soars, only to abate again in the 2006.2 release. Why are releases 2004.2 through 2006.1 so riddled with patches?Here's the backstory: DiffMerge started out as a serviceable but simple program. In its early life, it did little to discriminate between actual merge conflicts and nonconflicting, adjacent-line changes. In 2004, we enhanced DiffMerge to be smarter about detecting and resolving conflicts. As of the 2004.2 release, DiffMerge was certainly more capable, but it was buggy. We got lots of bug reports in 2004.2 and 2005.1—hence the large number of patches. We tried to clean up the code for release 2005.2, but the cleanup resulted in a bug so intractable that we had to restore the 2005.1 version into the 2005.2 release. Then, in 2006, we rewrote the troublesome parts of DiffMerge completely. The rewrite was quite successful, although we did have to tweak it in release 2006.1. Since then, DiffMerge been very stable, and its post-release patch rate is back down to zero.
Figure 32-4: Number of patches applied to DiffMerge per releaseEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauTo a programmer working on code in motion, beauty is code that can be modified with a minimum of fuss. You read the code, determine what it does, and change it. Your success depends as much on how well you understood the code to start with as it does on your ability to code. It also depends on how well your code is understood by the next programmers to tackle it; if you're never called in to help them out, you've done well.Were we to trim the narrative from this chapter, all we'd really have left to say is that the success of code in motion depends on how comprehensible it is to the programmers who read it. But this is not news to anyone.What is news is that programmers read code in diffs, patches, merges, compiler errors, and debuggers—not just in syntax-colored text editors—and that they frequently, if unconsciously, infer logic from the visual appearance of code as well as from the code itself. In other words, there's more to comprehending code than meets the eye.In this chapter, we've examined the effect of using "The Seven Pillars of Pretty Code" as guidelines to make code more comprehensible in more contexts. We've had success with the Seven Pillars. We've used them to write code that can move with the flow of change, and we think that's beautiful.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Acknowledgments
- InhaltsvorschauChristopher Seiwald, James Strickland, Jeff Anton, Mark Mears, Caedmon Irias, Leigh Brasington, and Michael Bishop contributed to DiffMerge. Perforce Software holds the copyright to the DiffMerge source code.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Further Reading
- InhaltsvorschauKim, S., Adaptive Bug Prediction by Analyzing Project History, Ph.D. Dissertation, Department of Computer Science, University of California Santa Cruz, 2006.McConnell, S., Code Complete, Microsoft Press, 1993.McMullin, J., Varnhagen, C. K., Heng, P., and Apedoe, X., "Effects of Surrounding Information and Line Length on Text Comprehension from the Web," Canadian Journal of Learning and Technology, Vol. 28, No. 1, Winter/hiver, 2002.O'Brien, M. P., Software Comprehension - A Review and Direction, Department of Computer Science and Information Systems, University of Limerick, Ireland, 2003.Pan, K., Using Evolution Patterns to Find Duplicated Bugs, Ph.D. Dissertation, Department of Computer Science, University of California Santa Cruz, 2006.Reichle, E.D., Rayner, K., and Pollatsek, A., The E-Z Reader Model of Eye Movement Control in Reading: Comparisons to Other Models, Behavioral and Brain Sciences, Vol. 26, No. 4, 2003.Seiwald, C., "The Seven Pillars of Pretty Code," Perforce Software, 2005, http://www.perforce.com/perforce/papers/prettycode.html.Tufte, Edward R., The Cognitive Style of PowerPoint, Graphics Press LLC, 2004.Whitehead, J., and Kim, S., Predicting Bugs in Code Changes, Google Tech Talks, 2006.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Chapter 33: Writing Programs for "The Book"
- InhaltsvorschauBrian HayesThe mathematician paul erdös often spoke of the book, a legendary volume (not to be found on the shelves of any earthly library) in which are inscribed the best possible proofs of all mathematical theorems. Perhaps there is also a Book for programs and algorithms, listing the best solution to every computational problem. To earn a place in those pages, a program must be more than just correct; it must also be lucid, elegant, concise, even witty.We all strive to create such gems of algorithmic artistry. And we all struggle, now and then, with a stubborn bit of code that just won't shine, no matter how hard we polish it. Even if the program produces correct results, there's something strained and awkward about it. The logic is a tangle of special cases and exceptions to exceptions; the whole structure seems brittle and fragile. Then, unexpectedly, inspiration strikes, or else a friend from down the hall shows you a new trick, and suddenly you've got one for The Book.In this chapter I tell the story of one such struggle. It's a story with a happy ending, although I'll leave it to readers to decide whether the final program deserves a place in The Book. I wouldn't be brash enough even to suggest the possibility except that this is one of those cases where the crucial insight came not from me but from a friend down the hall—or, rather, from a friend across the continent.The program I'll be talking about comes from the field of computational geometry, which seems to be peculiarly rich in problems that look simple on first acquaintance but turn out to be real stinkers when you get into the details. What do I mean by computational geometry? It's not the same as computer graphics, although there are close connections. Algorithms in computational geometry live not in the world of pixels but in that idealized ruler-and-compass realm where points are dimensionless, lines have zero thickness, and circles are perfectly round. Getting exact results is often essential in these algorithms, because even the slightest inaccuracy can utterly transform the outcome of a computation. Changing a digit far to the right of the decimal point might turn the world inside out.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- The Nonroyal Road
- InhaltsvorschauThe program I'll be talking about comes from the field of computational geometry, which seems to be peculiarly rich in problems that look simple on first acquaintance but turn out to be real stinkers when you get into the details. What do I mean by computational geometry? It's not the same as computer graphics, although there are close connections. Algorithms in computational geometry live not in the world of pixels but in that idealized ruler-and-compass realm where points are dimensionless, lines have zero thickness, and circles are perfectly round. Getting exact results is often essential in these algorithms, because even the slightest inaccuracy can utterly transform the outcome of a computation. Changing a digit far to the right of the decimal point might turn the world inside out.Euclid supposedly told a princely student, "There is no royal road to geometry." Among the nonroyal roads, the computational pathway is notably muddy, rutty, and potholed. The difficulties met along the way sometimes have to do with computational efficiency: keeping a program's running time and memory consumption within reasonable bounds. But efficiency is not the main issue with the geometric algorithms that concern me here; instead, the challenges are conceptual and aesthetic. Can we get it right? Can we make it beautiful?The program presented below in several versions is meant to answer a very elementary question: given three points in the plane, do all of the points lie along the same line? This is such a simple-sounding problem, it ought to have a simple solution as well. A few months ago, when I needed a routine to answer the collinearity question (as a component of a larger program), the task looked so straightforward that I didn't even pause to consult the literature and see how others might have solved it. I don't exactly regret that haste—wrestling with the problem on my own must have taught me something, or at least built some character—but I admit it was not the royal road to the right answer. I wound up repeating the steps of many who went before me. (Maybe that's why the road is so rutted!)Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Warning to Parenthophobes
- InhaltsvorschauI present the code in Lisp. I'm not going to apologize for my choice of programming language, but neither do I want to turn this chapter into a tract for recruiting Lisp converts. I'll just say that I believe multilingualism is a good thing. If reading the code snippets below teaches you something about an unfamiliar language, the experience will do you no harm. All of the procedures are very short—half a dozen lines or so. offers a thumbnail guide to the structure of a Lisp procedure.Incidentally, the algorithm implemented by the program in the figure is surely in The Book. It is Euclid's algorithm for calculating the greatest common divisor of two numbers.
Figure 33-1: Bits and pieces of a Lisp procedure definitionEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Three in a Row
- InhaltsvorschauIf you were working out a collinearity problem with pencil and paper, how would you go about it? One natural approach is to plot the positions of the three points on graph paper, and then, if the answer isn't obvious by inspection, draw a line through two of the points and see whether the line passes through the third point (see ). If it's a close call, accuracy in placing the points and drawing the line becomes critical.
Figure 33-2: Three noncollinear pointsA computer program can do the same thing, although for the computer nothing is ever "obvious by inspection." To draw a line through two points, the program derives the equation of that line. To see whether the third point lies on the line, the program tests whether or not the coordinates of the point satisfy the equation. (Exercise: For any set of three given points, there are three pairs of points you could choose to connect, in each case leaving a different third point to be tested for collinearity. Some choices may make the task easier than others, in the sense that less precision is needed. Is there some simple criterion for making this decision?)The equation of a line takes the form y=mx+b, where m is the slope and b is the y-intercept, the point (if there is one) where the line crosses the y-axis. So, given three points p, q, and r, you want to find the values of m and b for the line that passes through two of them, and then test the x-and y-coordinates of the third point to see if the same equation holds.Here's the code:(defun naive-collinear (px py qx qy rx ry) (let ((m (slope px py qx qy)) (b (y-intercept px py qx qy))) (= ry (+ (* m rx) b))))
The procedure is a predicate: it returns a Boolean value of true or false (in Lisp argot,tornil). The six arguments are the x-and y-coordinates of the pointsEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Slippery Slope
- InhaltsvorschauInstead of drawing a line through two points and seeing whether the third point is on the line, suppose I drew all three lines and checked to see whether they are really the same line. Actually, I would need to draw only two of the lines, because if line
is identical to line
, it must also be equal to
. Furthermore, it turns out I need to compare only the
slopes, not the y-intercepts. (Do you see why?)
Judging by eye whether two lines are really coincident or form a narrow
scissors angle might not be the most reliable procedure, but in the
computational world, it comes down to a simple comparison
of two numbers, the m values (see ).
Figure 33-3: Testing collinearity by comparing slopesI wrote a new version ofcollinearas follows:(defun mm-collinear (px py qx qy rx ry) (equalp (slope px py qx qy) (slope qx qy rx ry)))
What an improvement! This looks much simpler. There's noifexpression calling attention to the distinguished status of vertical lines; all sets of points are treated the same way.I must confess, however, that the simplicity and the apparent uniformity are an illusion, based on some Lisp trickery going on behind the scenes. Note that I compare the slopes not with=but with the generic equality predicateequalp. The procedure works correctly only becauseequalphappens to do the right thing whetherslopereturns a number ornil. (That is, the two slopes are considered equal if they are both the same number or if they are bothnil.) In a language with a fussier type system, the definition would not be so sweetly concise. It would have to look something like this:(defun typed-mm-collinear (px py qx qy rx ry) (let ((pq-slope (slope px py qx qy)) (qr-slope (slope qx qy rx ry))) (or (and (numberp pq-slope) (numberp qr-slope) (= pq-slope qr-slope)) (and (not pq-slope) (not qr-slope)))))
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - The Triangle Inequality
- InhaltsvorschauHere's yet another way of rethinking the problem. Observe that if p, q, and r are not collinear, they define a triangle (). It's a property of any triangle that the longest side is shorter than the sum of the smaller sides. If the three points are collinear, however, the triangle collapses on itself, and the longest "side" is exactly equal to the sum of the smaller "sides."
Figure 33-4: Testing collinearity by the triangle inequality(For the example shown in this figure, the long side is shorter than the sum of the other two sides by about 0.067.)The code for this version of the function is not quite so compact as the others, but what's going on inside is simple enough:(defuntriangle-collinear (px py qx qy rx ry) (let ((pq (distance px py qx qy)) (pr (distance px py rx ry)) (qr (distance qx qy rx ry))) (let ((sidelist (sort (list pq pr qr) #'>))) (= (first sidelist) (+ (second sidelist) (third sidelist))))))
The idea is to calculate the three side lengths, put them in a list, sort them in descending order of magnitude, and then compare the first (longest) side with the sum of the other two. If and only if these lengths are equal are the points collinear. This approach has a lot to recommend it. The calculation depends only on the geometric relations among the points themselves; it's independent of their position and orientation on the plane. Slopes and intercepts are not even mentioned. As a bonus, this version of the procedure also gives consistent and sensible answers when two or three of the points are coincident: all such point sets are considered collinear.Unfortunately, there is a heavy price to be paid for this simplicity. Up to this point, all computations have been done with exact arithmetic. If the original coordinates are specified by means of integers or rational numbers, then the slopes and intercepts are calculated without round-off or other error. For example, the line passing through (1 1) and (4 2) has slopeEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Meandering On
- InhaltsvorschauTo tell the rest of this story, I need to mention the context in which it took place. Some months ago I was playing with a simple model of river meandering—the formation of those giant horseshoe bends you see in the Lower Mississippi. The model decomposed the smooth curve of the river's course into a chain of short, straight segments. I needed to measure curvature along the river in terms of the bending angles between these segments, and in particular, I wanted to detect regions of zero curvature—hence the collinearity predicate.Another part of the program gave me even more trouble. As meanders grow and migrate, one loop sometimes runs into the next one, at which point the river takes a shortcut and leaves behind a stranded "oxbow" lake. (You don't want to be standing in the way when this happens on the Mississippi.) To detect such events in the model, I needed to scan for intersections of segments. Again, I was able to get a routine working, but it seemed needlessly complex, with a decision tree sprouting a dozen branches. As in the case of collinearity, vertical segments and coincident points required special handling, and I also had to worry about parallel segments.For the intersection problem, I eventually spent some time in the library and checked out what the Net had to offer. I learned a lot. That's where I found the tip that 1010 is close enough to infinity. And Bernard Chazelle and Herbert Edelsbrunner suggested a subtler way of finessing the singularities and degeneracies I had run into. In a 1992 review article on line-segment intersection algorithms (see the "Further Reading" section at the end of this chapter), they wrote:For the ease of exposition, we shall assume that no two endpoints have the same x-or y-coordinates. This, in particular, applies to the two endpoints of the same segment, and thus rules out the presence of vertical or horizontal segments…Our rationale is that the key ideas of the algorithm are best explained without having to worry about special cases every step of the way. Relaxing the assumptions is very easy (no new ideas are required) but tedious. That's for the theory. Implementing the algorithm so that the program works in all cases, however, is a daunting task. There are also numerical problems that alone would justify writing another paper. Following a venerable tradition, however, we shall try not to worry too much about it.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- "Duh!"—I Mean "Aha!"
- InhaltsvorschauIn cartoons, the moment of discovery is depicted as a light bulb turning on in a thought balloon. In my experience, that sudden flash of understanding feels more like being thumped in the back of the head with a two-by-four. When you wake up afterwards, you've learned something, but by then your new insight is so blindingly obvious that you can't quite believe you didn't know it all along. After a few days more, you begin to suspect that maybe you did know it; you must have known it; you just needed reminding. And when you pass the discovery along to the next person, you'll begin, "As everyone knows…."That was my reaction on reading Jonathan Shewchuk's "Lecture Notes on Geometric Robustness." He gives a collinearity algorithm that, once I understood it, seemed so natural and sensible that I'm sure it must have been latent within me somewhere. The key idea is to work with the area of a triangle rather than the perimeter, as in
triangle-collinear. Clearly, the area of a triangle is zero if and only if the triangle is a degenerate one, with collinear vertices. But measuring a function of the area rather a function of the perimeter has two big advantages. First, it can be done without square roots or other operations that would take us outside the field of rational numbers. Second, it is much less dependent on numerical precision.Consider a family of isosceles triangles with vertices (0 0), (x 1), and (2x 0). As x increases, the difference between the length of the base and the sum of the lengths of the two legs gets steadily smaller, and so it becomes difficult to distinguish this very shallow triangle from a totally flattened one with vertices (0 0), (x 0), and (2x 0). The area calculation doesn't suffer from this problem. On the contrary, the area grows steadily as the triangle gets longer (see ). Numerically, even without exact arithmetic, the computation is much more robust.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar. - Conclusion
- InhaltsvorschauMy adventures and misadventures searching for the ideal collinearity predicate do not make a story with a tidy moral. In the end, I believe I stumbled upon the correct solution to my specific problem, but the larger question of how best to find such solutions in general remains unsettled.One lesson that might be drawn from my experience is to seek help without delay: some-body out there knows more than you do. You may as well take advantage of the cumulative wisdom of your peers and predecessors. In other words, Google can probably find the algorithm you want, or even the source code, so why waste time reinventing it?I have mixed feelings about this advice. When an engineer is designing a bridge, I expect her to have a thorough knowledge of how others in the profession have solved similar problems in the past. Yet expertise is not merely skill in finding and applying other people's bright ideas; I want my bridge designer to have solved a few problems on her own as well.Another issue is how long to keep an ailing program on life support. In this chapter, I have been discussing the tiniest of programs, so it cost very little to rip it up and start over whenever I encountered the slightest unpleasantness. For larger projects, the decision to throw one away is never so easy. And doing so is not necessarily prudent: you are trading known problems for unknown ones.Finally, there is the question of just how much the quest for "beautiful code" should be allowed to influence the process of programming or software development. The mathematician G. H. Hardy proclaimed, "There is no permanent place in the world for ugly mathematics." Do aesthetic principles carry that much weight in computer science as well? Here's another way of asking the same question: do we have any guarantee that a Book-quality program exists for every well-formulated computational problem? MaybeEnde der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Further Reading
- InhaltsvorschauAvnaim, F., J.-D. Boissonnat, O. Devillers, F. P. Preparata, and M. Yvinec. "Evaluating signs of determinants using single-precision arithmetic." Algorithmica, Vol. 17, pp. 111–132, 1997.Bentley, Jon L., and Thomas A. Ottmann. "Algorithms for reporting and counting geometric intersections." IEEE Transactions on Computers, Vol. C-28, pp. 643–647, 1979.Braden, Bart. "The surveyor's area formula." The College Mathematics Journal, Vol. 17, No. 4, pp. 326–337, 1986.Chazelle, Bernard, and Herbert Edelsbrunner. "An optimal algorithm for intersecting line segments in the plane." Journal of the Association for Computing Machinery, Vol. 39, pp. 1–54, 1992.Forrest, A. R. "Computational geometry and software engineering: Towards a geometric computing environment." In Techniques for Computer Graphics (edited by D. F. Rogers and R. A. Earnshaw), pp. 23–37. New York: Springer-Verlag, 1987.Forrest, A. R. "Computational geometry and uncertainty." In Uncertainty in Geometric Computations (edited by Joab Winkler and Mahesan Niranjan), pp. 69–77. Boston: Kluwer Academic Publishers, 2002.Fortune, Steven, and Christopher J. Van Wyk. "Efficient exact arithmetic for computational geometry." In Proceedings of the Ninth Annual Symposium on Computational Geometry, pp. 163–172. New York: Association for Computing Machinery, 1993.Guibas, Leonidas, and Jorge Stolfi. "Primitives for the manipulation of general subdivisions and the computation of Voronoi diagrams." ACM Transactions on Graphics, Vol. 4, No. 2, pp. 74–123, 1985.Hayes, Brian. "Only connect!" http://bit-player.org/2006/only-connect. [Weblog item, posted September 14, 2006.]Hayes, Brian. "Computing science: Up a lazy river." American Scientist, Vol. 94, No. 6, pp. 490–494, 2006. (http://www.americanscientist.org/AssetDetail/assetid/54078)Hoffmann, Christoph M., John E. Hopcroft and Michael S. Karasick. "Towards implementing robust geometric computations."Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Appendix : Afterword
- InhaltsvorschauAndy OramBeautiful code surveys the range of human invention and ingenuity in one area of endeavor: the development of computer systems. The beauty in each chapter comes from the discovery of unique solutions, a discovery springing from the authors' power to look beyond set boundaries, to recognize needs overlooked by others, and to find surprising solutions to troubling problems.Many of the authors confronted limitations—in the physical environment, in the resources available, or in the very definition of their requirements—that made it hard even to imagine solutions. Others entered domains where solutions already existed, but brought in a new vision and a conviction that something much better could be achieved.All the authors in this book have drawn lessons from their projects. But we can also draw some broader lessons after making the long and eventful journey through the whole book.First, there are times when tried and true rules really do work. So, often one encounters difficulties when trying to maintain standards for robustness, readability, or other tenets of good software engineering. In such situations, it is not always necessary to abandon the principles that hold such promise. Sometimes, getting up and taking a walk around the problem can reveal a new facet that allows one to meet the requirements without sacrificing good technique.On the other hand, some chapters confirm the old cliché that one must know the rules before one can break them. Some of the authors built up decades of experience before taking a different path toward solving one thorny problem—and this experience gave them the confidence to break the rules in a constructive way.On the other hand, cross-disciplinary studies are also championed by the lessons in this book. Many authors came into new domains and had to fight their way in relative darkness. In these situations, a particularly pure form of creativity and intelligence triumphed.Finally, we learn from this book that beautiful solutions don't last for all time. New circumstances always require a new look. So, if you read the book and thought, "I can't use these authors' solutions on any of my own projects," don't worry—next time these authors have projects, they will use different solutions, too.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
- Appendix : Contributors
- InhaltsvorschauJon Bentley is a computer scientist at avaya labs research. His research interests include programming techniques, algorithm design, and the design of software tools and interfaces. He has written books on programming, and articles on a variety of topics, ranging from the theory of algorithms to software engineering. He received a B.S. from Stanford in 1974, and an M.S. and Ph.D. from the University of North Carolina in 1976, then taught Computer Science at Carnegie Mellon University for six years. He joined Bell Labs Research in 1982, and retired in 2001 to join Avaya. He has been a visiting faculty member at West Point and Princeton, and he has been a member of teams that have shipped software tools, telephone switches, telephones, and web services.Tim Bray managed the Oxford English Dictionary project at the University of Waterloo in Ontario, Canada in 1987–1989, co-founded Open Text Corporation in 1989, launched one of the first public web search engines in 1995, co-invented XML 1.0 and co-edited "Namespaces in XML" between 1996 and 1999, founded Antarctica Systems in 1999, and served as a Tim Berners-Lee appointee on the W3C Technical Architecture Group in 2002–2004. Currently, he serves as Director of Web Technologies at Sun Microsystems, publishes a popular weblog, and co-chairs the IETF AtomPub Working Group.Bryan Cantrill is a Distinguished Engineer at Sun Microsystems, where he has spent most of his career working on the Solaris kernel. Most recently he, along with colleagues Mike Shapiro and Adam Leventhal, designed and implemented DTrace, a facility for dynamic instrumentation of production systems that won the Wall Street Journal's top award for innovation in 2006.Douglas Crockford is a product of our public school system. A registered voter, he owns his own car. He has developed office automation systems. He did research in games and music at Atari. He was Director of Technology at Lucasfilm Ltd. He was Director of New Media at Paramount. He was the founder and CEO of Electric Communities. He was founder and CTO of State Software, where he discovered JSON. He is now an architect at Yahoo! Inc.Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
Zurück zu Beautiful Code
