Advanced Rails

First Edition, January 2008
ISBN 978-0-596-51032-9
357 pages
EUR 29.00

Table of Contents

	
Chapter 1: Foundational Techniques
Content preview
Simplicity is prerequisite for reliability.
—Edsger W. Dijkstra
Since its initial release in July 2004, the Ruby on Rails web framework has been steadily growing in popularity. Rails has been converting PHP, Java, and .NET developers to a simpler way: a model-view-controller (MVC) architecture, sensible defaults ("convention over configuration"), and the powerful Ruby programming language.
Rails had somewhat of a bad reputation for a lack of documentation during its first year or two. This gap has since been filled by the thousands of developers who use, contribute to, and write about Ruby on Rails, as well as by the Rails Documentation project (http://railsdocumentation.org/). There are hundreds of blogs that offer tutorials and advice for Rails development.
This book's goal is to collect and distill the best practices and knowledge embodied by the community of Rails developers and present everything in an easy-to-understand, compact format for experienced programmers. In addition, I seek to present facets of web development that are often undertreated or dismissed by the Rails community.
End of content preview. The rest of this section is not viewable here.
What Is Metaprogramming?
Rails brought metaprogramming to the masses. Although it was certainly not the first application to use Ruby's extensive facilities for introspection, it is probably the most popular. To understand Rails, we must first examine the parts of Ruby that make Rails possible. This chapter lays the foundation for the techniques discussed in the remainder of this book.
Metaprogramming is a programming technique in which code writes other code or introspects upon itself. The prefix meta- (from Greek) refers to abstraction; code that uses metaprogramming techniques works at two levels of abstraction simultaneously.
Metaprogramming is used in many languages, but it is most popular in dynamic languages because they typically have more runtime capabilities for manipulating code as data. Though reflection is available in more static languages such as C# and Java, it is not nearly as transparent as in the more dynamic languages such as Ruby because the code and data are on two separate levels at runtime.
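As a rough illustration of both halves of that definition in Ruby, the sketch below has code define methods at runtime and then inspect the result. The Temperature class and its unit table are hypothetical names chosen for this example only.

```ruby
# Hypothetical class used only for illustration.
class Temperature
  UNITS = { :celsius => 0.0, :kelvin => 273.15 }

  def initialize(celsius)
    @celsius = celsius
  end

  # Code writing code: generate one reader method per unit when the class loads.
  UNITS.each do |unit, offset|
    define_method("in_#{unit}") { @celsius + offset }
  end
end

t = Temperature.new(20.0)
t.in_kelvin                           # the generated method is called like a handwritten one

# Code introspecting upon itself: ask the class and the object what they define.
Temperature.instance_methods(false)   # lists the generated in_celsius and in_kelvin
t.respond_to?(:in_celsius)            # => true
```

The generated methods are ordinary methods once defined, which is why the introspection calls can see them.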
Introspection is typically done on one of two levels. Syntactic introspection is the lowest level of introspection: direct examination of the program text or token stream. Template-based and macro-based metaprogramming usually operate at the syntactic level.
Lisp encourages this style of metaprogramming by using S-expressions (essentially a direct translation of the program's abstract syntax tree) for both code and data. Metaprogramming in Lisp heavily involves macros, which are essentially templates for code. This offers the advantage of working on one level; code and data are both represented in the same way, and the only thing that distinguishes code from data is whether it is evaluated. However, there are some drawbacks to metaprogramming at the syntactic level. Variable capture and inadvertent multiple evaluation are direct consequences of having code on two levels of abstraction in the source evaluated in the same namespace. Although there are standard Lisp idioms for dealing with these problems, they represent more things the Lisp programmer must learn and think about.
End of content preview. The rest of this section is not viewable here.
Ruby Foundations
This book relies heavily on a firm understanding of Ruby. This section will explain some aspects of Ruby that are often confusing or misunderstood. Some of this may be familiar, but these are important concepts that form the basis for the metaprogramming techniques covered later in this chapter.
Classes and modules are the foundation of object-oriented programming in Ruby. Classes facilitate encapsulation and separation of concerns. Modules can be used as mixins—bundles of functionality that are added onto a class to add behaviors in lieu of multiple inheritance. Modules are also used to separate classes into namespaces.
In Ruby, every class name is a constant. This is why Ruby requires class names to begin with an uppercase letter. The constant evaluates to the class object, which is an object of the class Class. This is distinct from the Class object, which represents the actual class Class. When we refer to a "class object" (with a lowercase C), we mean any object that represents a class (including Class itself). When we refer to the "Class object" (uppercase C), we mean the class Class, of which every class object is an instance.
The class Class inherits from Module; every class is also a module. However, there is an important distinction. Classes cannot be mixed in to other classes, and classes cannot extend objects; only modules can.
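These relationships can be checked directly from Ruby. The following is a small sketch; Polite, Greeter, and the error_class helper are hypothetical names introduced for the example.

```ruby
module Polite
  def greet
    "Good day"
  end
end

class Greeter
  include Polite          # a module can be mixed in to a class
end

Greeter.class             # => Class   (the constant evaluates to a class object)
Class.superclass          # => Module  (Class inherits from Module)
Greeter.is_a?(Module)     # => true    (so every class is also a module)

# Helper that reports the class of the error raised by a block, or nil if none.
def error_class
  yield
  nil
rescue StandardError => e
  e.class
end

error_class { Class.new { include Greeter } }   # => TypeError (a class cannot be mixed in)
error_class { Object.new.extend(Greeter) }      # => TypeError (a class cannot extend an object)
error_class { Object.new.extend(Polite) }       # => nil       (a module can)
```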
Method lookup in Ruby can be very confusing, but it is quite regular. The easiest way to understand complicated situations is to visualize the data structures that Ruby creates behind the scenes.
Every Ruby object has a set of fields in memory:
klass
A pointer to the class object of this object. (It is klass instead of class because the latter is a reserved word in C++ and Ruby; if it were called class, Ruby would compile with a C compiler but not with a C++ compiler. This deliberate misspelling is used everywhere in Ruby.)
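The description of the in-memory fields is cut off in this preview, so the sketch below only observes method lookup from the Ruby side rather than showing the underlying structures. Animal, Loud, and Dog are hypothetical names.

```ruby
class Animal
  def speak
    "hello"
  end
end

module Loud
  def speak
    super.upcase   # continue up the lookup chain, then transform the result
  end
end

class Dog < Animal
  include Loud
end

# The lookup order: the class itself, then its mixins, then its superclass.
Dog.ancestors.first(3)   # => [Dog, Loud, Animal]
Dog.new.speak            # => "HELLO"

# A singleton method is found before anything defined in the object's class.
rex = Dog.new
def rex.speak
  "woof"
end
rex.speak                # => "woof"
Dog.new.speak            # => "HELLO" (other instances are unaffected)
```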
End of content preview. The rest of this section is not viewable here.
Metaprogramming Techniques
Now that we've covered the fundamentals of Ruby, we can examine some of the common metaprogramming techniques that are used in Rails.
Although we write examples in Ruby, most of these techniques are applicable to any dynamic programming language. In fact, many of Ruby's metaprogramming idioms are shamelessly stolen from Lisp, Smalltalk, or Perl.
Often we want to create an interface whose methods vary depending on some piece of runtime data. The most prominent example of this in Rails is ActiveRecord's attribute accessor methods. Method calls on an ActiveRecord object (like person.name) are translated at runtime to attribute accesses. At the class-method level, ActiveRecord offers extreme flexibility: Person.find_all_by_user_id_and_active(42, true) is translated into the appropriate SQL query, raising the standard NoMethodError exception should those attributes not exist.
The magic behind this is Ruby's method_missing method. When a nonexistent method is called on an object, Ruby first checks that object's class for a method_missing method before raising a NoMethodError. method_missing's first argument is the name of the method called; the remainder of the arguments correspond to the arguments passed to the method. Any block passed to the method is passed through to method_missing. So, a complete method signature is:

	def method_missing(method_id, *args, &block)
	  ...
	end

There are several drawbacks to using method_missing:
  • It is slower than conventional method lookup. Simple tests indicate that method dispatch with method_missing is at least two to three times as expensive in time as conventional dispatch.
  • Since the methods being called never actually exist—they are just intercepted at the last step of the method lookup process—they cannot be documented or introspected as conventional methods can.
  • Because all dynamic methods must go through the method_missing method, the body of that method can become quite large if there are many different aspects of the code that need to add methods dynamically.
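To make the mechanism concrete, here is a minimal sketch of attribute readers built on method_missing. It is not ActiveRecord's implementation; the Record class and its hash of attributes are assumptions made for the example.

```ruby
# Hypothetical class: exposes the keys of a hash as reader methods.
class Record
  def initialize(attributes)
    @attributes = attributes
  end

  # Called only after normal method lookup has failed.
  def method_missing(method_id, *args, &block)
    key = method_id.to_s
    if @attributes.key?(key) && args.empty?
      @attributes[key]
    else
      super   # not ours: fall through to the standard NoMethodError
    end
  end

  # Keeps respond_to? consistent with what method_missing will answer.
  def respond_to_missing?(method_id, include_private = false)
    @attributes.key?(method_id.to_s) || super
  end
end

person = Record.new("name" => "Alice", "age" => 30)
person.name                  # => "Alice"
person.respond_to?(:age)     # => true
person.public_methods(false) # does not list name or age; they never exist as methods
```

Calling super for unknown names matters: without it, every misspelled method call on the object would silently return nil instead of raising.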
End of content preview. The rest of this section is not viewable here.
Functional Programming
The paradigm of functional programming focuses on values rather than the side effects of evaluation. In contrast to imperative programming, the functional style deals with the values of expressions in a mathematical sense. Function application and composition are first-class concepts, and mutable state (although it obviously exists at a low level) is abstracted away from the programmer.
This is a somewhat confusing concept, and it is often unfamiliar even to experienced programmers. The best parallels are drawn from mathematics, from which functional programming is derived.
Consider the mathematical equation x = 3. The equals sign in that expression indicates equivalence: "x is equal to 3." On the contrary, the Ruby statement x = 3 is of a completely different nature. That equals sign denotes assignment: "assign 3 to x." In a functional programming language, equals usually denotes equality rather than assignment. The key difference here is that functional programming languages specify what is to be calculated; imperative programming languages tend to specify how to calculate it.
The cornerstone of functional programming, of course, is functions. The primary way that the functional paradigm influences mainstream Ruby programming is in the use of higher-order functions (also called first-class functions, though these two terms are not strictly equivalent). Higher-order functions are functions that operate on other functions. Higher-order functions usually either take one or more functions as an argument or return a function.
Ruby supports functions as mostly first-class objects; they can be created, manipulated, passed, returned, and called. Anonymous functions are represented as Proc objects, created with Proc.new or Kernel#lambda:

	add = lambda{|a,b| a + b}
	add.class # => Proc
	add.arity # => 2

	# call a Proc with Proc#call
	add.call(1,2) # => 3

	# alternate syntax
	add[1,2] # => 3
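Building on Proc objects, a higher-order function either accepts functions or returns one. The sketch below shows both cases; compose and adder are hypothetical helper names, not part of Ruby's core library.

```ruby
# Takes two functions and returns a new one that applies g, then f.
def compose(f, g)
  lambda { |x| f.call(g.call(x)) }
end

# Returns a function: a closure that remembers n.
def adder(n)
  lambda { |x| x + n }
end

double = lambda { |x| x * 2 }
add_three = adder(3)

compose(double, add_three).call(4)   # => 14  (double(add_three(4)))
compose(add_three, double).call(4)   # => 11  (add_three(double(4)))

# Iterators are higher-order as well: a Proc can stand in for a block via &.
[1, 2, 3].map(&double)               # => [2, 4, 6]
```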

The most common use for blocks in Ruby is in conjunction with iterators. Many programmers who come to Ruby from other, more imperative-style languages start out writing code like this:
End of content preview. The rest of this section is not viewable here.
Examples
This example ties together several of the techniques we have seen in this chapter. We return to the Person example, where we want to time several expensive methods:

	class Person
	  def refresh
	    # ...
	  end

	  def dup
	    # ...
	  end
	end

In order to deploy this to a production environment, we may not want to leave our timing code in place all of the time because of overhead. However, we probably want to have the option to enable it when debugging. We will develop code that allows us to add and remove features (in this case, timing code) at runtime without touching the original source.
First, we set up methods wrapping each of our expensive methods with timing commands. As usual, we do this by monkeypatching the timing methods into Person from another file to separate the timing code from the actual model logic:

	class Person
	  TIMED_METHODS = [:refresh, :dup]

	  TIMED_METHODS.each do |method|
	    # set up _without_timing alias of original method
	    alias_method :"#{method}_without_timing", method

	    # set up _with_timing method that wraps the original in timing code
	    define_method :"#{method}_with_timing" do
	      start_time = Time.now.to_f
	      returning(self.send(:"#{method}_without_timing")) do
	        end_time = Time.now.to_f

	        puts "#{method}: #{"%.3f" % (end_time-start_time)} s."
	      end
	    end
	  end
	end

We add singleton methods to Person to enable or disable tracing:

	class << Person
	  def start_trace
	    TIMED_METHODS.each do |method|
	      alias_method method, :"#{method}_with_timing"
	    end
	  end

	  def end_trace
	    TIMED_METHODS.each do |method|
	      alias_method method, :"#{method}_without_timing"
	    end
	  end
	end

To enable tracing, we wrap each method call in the timed method call. To disable it, we simply point the method call back to the original method (which is now only accessible by its _without_timing alias).
To use these additions, we simply call the Person.start_trace and Person.end_trace methods:

	p = Person.new
	p.refresh # => (...)

	Person.start_trace
	p.refresh # => (...)
	# -> refresh: 0.500 s.

	Person.end_trace
	p.refresh # => (...)
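The listing above relies on returning, which comes from ActiveSupport. For readers who want to try the alias-swapping idea outside Rails, here is a self-contained variation in plain Ruby. It is a sketch under stated assumptions: the Report class, its build method, and the TIMINGS array (used instead of puts so the result can be inspected) are all hypothetical.

```ruby
# Hypothetical model with one "expensive" method.
class Report
  def build
    (1..1000).inject(0) { |sum, i| sum + i }
  end
end

# Reopen the class to add the timing layer, as the text does for Person.
class Report
  TIMED_METHODS = [:build]
  TIMINGS = []   # collected [method, seconds] pairs

  TIMED_METHODS.each do |method|
    # keep the original reachable under a _without_timing name
    alias_method :"#{method}_without_timing", method

    # wrapper that times the original and passes its return value through
    define_method :"#{method}_with_timing" do |*args, &block|
      start_time = Time.now.to_f
      result = send(:"#{method}_without_timing", *args, &block)
      TIMINGS << [method, Time.now.to_f - start_time]
      result
    end
  end

  # point the public name at the timed wrapper
  def self.start_trace
    TIMED_METHODS.each { |m| alias_method m, :"#{m}_with_timing" }
  end

  # point the public name back at the original
  def self.end_trace
    TIMED_METHODS.each { |m| alias_method m, :"#{m}_without_timing" }
  end
end

Report.start_trace
Report.new.build      # timed: one entry is appended to Report::TIMINGS
Report.end_trace
Report.new.build      # untimed again
```

Because the wrapper always calls the _without_timing alias, switching tracing on and off repeatedly never stacks wrappers on top of each other.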

End of content preview. The rest of this section is not viewable here.
Further Reading
Minero AOKI's Ruby Hacking Guide is an excellent introduction to Ruby's internals. It is being translated into English at http://rhg.rubyforge.org/.
Eigenclass (http://eigenclass.org/) has several more technical articles on Ruby.
Evil.rb is a library for accessing the internals of Ruby objects. It can change objects' internal state, traverse and examine the klass and super pointers, change an object's class, and cause general mayhem. Use with caution. It is available at http://rubyforge.org/projects/evil/. Mauricio Fernández gives a taste of Evil at http://eigenclass.org/hiki.rb?evil.rb+dl+and+unfreeze.
Jamis Buck has a very detailed exploration of the Rails routing code, as well as several other difficult parts of Rails, at http://weblog.jamisbuck.org/under-the-hood.
One of the easiest-to-understand, best-architected pieces of Ruby software I have seen is Capistrano 2, also developed by Jamis Buck. Not only does Capistrano have a very clean API, it is extremely well built from the bottom up. If you haven't been under Capistrano's hood, it will be well worth your time. The source is available via Subversion from http://svn.rubyonrails.org/rails/tools/capistrano/.
Mark Jason Dominus's book Higher-Order Perl (Morgan Kaufmann Publishers) was revolutionary in introducing functional programming concepts into Perl. When Higher-Order Perl was released in 2005, Perl was a language not typically known for its functional programming support. Most of the examples in the book can be translated fairly readily into Ruby; this is a good exercise if you are familiar with Perl. James Edward Gray II has written up his version in his "Higher-Order Ruby" series, at http://blog.grayproductions.net/categories/higherorder_ruby.
The Ruby Programming Language, by David Flanagan and Yukihiro Matsumoto (O'Reilly), is a book covering both Ruby 1.8 and 1.9. It is due out in January 2008. The book includes a section on functional programming techniques in Ruby.
End of content preview. The rest of this section is not viewable here.
Chapter 2: ActiveSupport and RailTies
[Programs] must be written for people to read, and only incidentally for machines to execute.
—H. Abelson and G. Sussman
Structure and Interpretation of Computer Programs, MIT Press, 1985
We continue in our bottom-up view of Rails by examining the pieces that form the basis for Rails. ActiveSupport is a library that provides generic, reusable functions that are not specific to any one part of Rails. We can use many of these methods ourselves when writing our application code. RailTies is the other half, containing parts that glue Rails together in a Rails-specific way. Although we will not usually use RailTies functions in our own code, it is important and instructive to examine them.
Most of this chapter is nonsequential; feel free to skip around. However, in accordance with our bottom-up approach to Rails, later chapters will build on this material.
End of content preview. The rest of this section is not viewable here.
Ruby You May Have Missed
It is very easy to overlook some of Ruby's more useful methods. The best way to find them is to read code. Here are some of the more obscure, but helpful, ones.
  • Array#* can operate as Array#join (if given a string or stringlike argument); it also does repetition:
    
    	[1, 2, 3] * "; " # => "1; 2; 3"
    	[0] * 5 # => [0, 0, 0, 0, 0]
    
    
  • Array#pack and String#unpack are useful for working with binary files. why the lucky stiff uses Array#pack to stuff a series of numbers into a BMP-formatted sparkline graph without any heavy image libraries, in 13 lines of code (http://redhanded.hobix.com/inspect/sparklinesForMinimalists.html).
  • Dir.[] is shorthand for Dir.glob:
    
    	Dir["/s*"] # => ["/scripts", "/srv", "/selinux", "/sys", "/sbin"]
    
    
  • Enumerable#all? returns true if the given block returns a true value for all items in the enumerable. Similarly, Enumerable#any? returns true if the block returns a true value for any item.
    
    	(1..10).all?{|i| i > 0 && i < 15} # => true
    	(1..10).any?{|i| i*i == 9} # => true
    	(1..10).any?{|i| i*i == 8} # => false
    
    
  • Enumerable#grep filters an enumerable against another object using ===, affording all of the usual flexibility of the === method:
    
    	[1, 2, 3].methods.grep(/^so/) # => ["sort!", "sort", "sort_by"]
    	[1, :two, "three", 4].grep(Fixnum) # => [1, 4]
    
    
  • Enumerable#sort_by sorts the enumerable by the value of the given block, by performing a Schwartzian transform on the data. It builds up a set of input elements, each stored with the result of applying the block to that element. Because the block should return the same value when called with the same input, it only needs to be called once per input. Thus, O(n) calculations are done rather than O(n lg n).
    However, the sort_by
End of content preview. The rest of this section is not viewable here.
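The Schwartzian transform attributed to Enumerable#sort_by in the list above can be sketched in a few lines. This is an illustration of the decorate, sort, undecorate idea, not Ruby's actual implementation; sort_by_sketch is a hypothetical helper.

```ruby
# Hypothetical helper: sorts items by the value the block returns for each.
def sort_by_sketch(items)
  # decorate: call the block exactly once per element and keep the key with it
  decorated = items.map { |item| [yield(item), item] }
  # sort: compare only the cached keys
  sorted = decorated.sort { |a, b| a.first <=> b.first }
  # undecorate: drop the keys again
  sorted.map { |pair| pair.last }
end

words = %w(banana fig pear apple)
sort_by_sketch(words) { |w| w.length }   # => ["fig", "pear", "apple", "banana"]
```

A plain sort with a block would instead evaluate the key expression on both operands of every comparison, which is what makes the cached version attractive when the key is expensive to compute.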
How to Read Code
As implied by the quote introducing this chapter, the primary purpose of source code should not be expressing implementation to a computer; it should be expressing meaning to people. Programming languages are an incredibly expressive and terse medium for the concepts programmers talk about. Proposals to make programming languages more English-like inevitably fail not because of poor implementation but because there is an inherent impedance mismatch between the domains of English language and computer programming.
Thus, computer programming languages should be compared not by their levels of raw power (any Turing-complete language trivially satisfies this requirement) or speed of execution (for most applications, speed is not critical) but by their programmer efficiency—the speed at which a programmer can accurately translate his thoughts into code.
Closely related to programmer efficiency is maintainer efficiency: the ability of a maintainer (who may be the original developer, 12 months later) to read the code and deduce what is going on. Perl is often criticized for being "write-only"; it is easy to write code that is nearly unreadable to future developers. Such code would have high programmer efficiency at the cost of maintainer efficiency.
Ruby wins on both fronts: most Ruby code is easy to write and read, once you know the basic syntax and semantics. Still, diving into any large project such as Rails is difficult. Here, we discuss ways to begin reading a codebase.
One disadvantage of the dynamic nature of Ruby is that there is little opportunity for development-time reflection on Ruby code. When using a more static language, IDEs can infer the type of variables, and from that deduce the methods available to those variables. Thus, they can offer assistance in coding by suggesting variable and method names. In Ruby, in principle the only way to know the methods available to an object is to evaluate the expression returning that object's value. This is clearly impractical due to side effects of evaluation or differences in development and execution environment. The effect is that it is impossible to write a general code-completion-style IDE for Ruby.
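Since the methods of an object are only known once it exists, one practical way to explore an unfamiliar codebase is to ask live objects about themselves, for example in an interactive session. The sketch below assumes a current Ruby version (Method#source_location is not available in Ruby 1.8); Invoice is a hypothetical class.

```ruby
class Invoice
  def total
    42
  end
end

invoice = Invoice.new

# Methods this object has beyond those every object has.
invoice.methods - Object.instance_methods   # => [:total]

# Which class or module defines the method, and where.
invoice.method(:total).owner                # => Invoice
invoice.method(:total).source_location      # => [file name, line number] of the definition
```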
End of content preview. The rest of this section is not viewable here.
ActiveSupport
ActiveSupport is the library of utility methods that Rails uses. We examine them in detail here for two reasons. First, they can be useful to our application code: we can directly use many of these libraries and methods to our advantage when writing Rails applications. Second, we can learn many things about Ruby programming by dissecting these parts. They are small and relatively easy to digest.
dependencies.rb
Dependencies autoloads missing constants by trying to find the file associated with the constant. When you attempt to access a nonexistent constant, such as Message, Dependencies will try to find and load message.rb from any directory in Dependencies.load_paths.
Dependencies defines Module#const_missing and Class#const_missing, which both proxy to Dependencies.load_missing_constant(const_parent, const_id). That method searches the load paths for a file with the appropriate name; if found, Dependencies loads the file and ensures that it defined the appropriate constant.
Alternatively, Rails will create an empty module to satisfy nesting in the case of nested models and controllers. If a directory named app/models/store/ exists, Store will be created as an empty module, by the following process:
  1. Some piece of code references the undefined constant Store.
  2. Ruby calls const_missing.
  3. const_missing calls Dependencies.load_missing_constant(Object,:Store).
  4. load_missing_constant attempts to find and load store.rb somewhere in its list of load paths (Dependencies.load_paths). It fails to find such a file.
  5. load_missing_constant sees that app/models/store exists and is a directory. It creates a module, assigns it to the appropriate constant, and returns.
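The steps above can be imitated in a few lines of plain Ruby. The following is a simplified sketch, not the code in dependencies.rb: AutoloadSketch and its LOAD_PATHS constant are hypothetical, the hook is confined to one namespace instead of being installed on Module and Class, and loaded files are evaluated inside that namespace.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical namespace standing in for Dependencies; not the Rails code.
module AutoloadSketch
  LOAD_PATHS = []

  # Ruby calls this when AutoloadSketch::SomeName is not defined.
  def self.const_missing(const_id)
    # Message -> "message", LineItem -> "line_item"
    file_name = const_id.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase

    # Look for <file_name>.rb in each load path and evaluate it if found.
    LOAD_PATHS.each do |path|
      file = File.join(path, "#{file_name}.rb")
      next unless File.file?(file)
      module_eval(File.read(file), file)
      return const_get(const_id) if const_defined?(const_id, false)
      raise NameError, "#{file} did not define #{const_id}"
    end

    # No file: if a directory of that name exists, create an empty module.
    LOAD_PATHS.each do |path|
      return const_set(const_id, Module.new) if File.directory?(File.join(path, file_name))
    end

    super   # nothing found: raise the usual NameError
  end
end

Dir.mktmpdir do |root|
  File.open(File.join(root, "message.rb"), "w") { |f| f.puts "class Message; end" }
  FileUtils.mkdir(File.join(root, "store"))
  AutoloadSketch::LOAD_PATHS << root

  AutoloadSketch::Message   # message.rb is found and defines the class
  AutoloadSketch::Store     # no store.rb, but store/ exists, so an empty module is created
end
```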
deprecation.rb
The ActiveSupport::Deprecation module provides a method by which old APIs are marked for removal. At its core, it is just a fancy warning mechanism. When old APIs are used, they generate a warning in development or test mode. Deprecation warnings are invoked through the
End of content preview. The rest of this section is not viewable here.
Core Extensions
The Core Extensions are ActiveSupport's collection of extensions to Ruby's core classes and modules. They are basic design patterns solving problems that are encountered often in Ruby. These methods are one level below the Rails API; they are the internal functions that Rails uses. However, we describe them here because they are extremely useful during the process of building a Rails application. The core extensions are low-level utility methods for Ruby; they do not make the impossible possible, but they do help to simplify application code.

Conversions

core_ext/array/conversions.rb
  • Array#to_sentence joins the array's elements and converts to a string:
    
    	%w(Larry Curly Moe).to_sentence # => "Larry, Curly, and Moe"
    
    
  • Array#to_s(:db) collects an array of ActiveRecord objects (or other objects that respond to the id method) into a SQL-friendly string.
  • Array#to_xml converts an array of ActiveRecord objects into XML. This is usually used to implement REST-style web services. It relies on the contained objects' implementation of to_xml (such as ActiveRecord::XmlSerialization.to_xml).
    
    	render :xml => Product.find(:all).to_xml 
    
    
  • Note that render(:xml => …) and render(:json => …) are new synonyms for render(:text => …) that change the response's MIME type appropriately.
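To show what Array#to_sentence does without loading ActiveSupport, here is a plain-Ruby approximation. It is only a sketch of the default behavior shown above; to_sentence_sketch is a hypothetical helper and ignores any options the real method accepts.

```ruby
# Hypothetical stand-in for Array#to_sentence with its default formatting.
def to_sentence_sketch(words)
  case words.length
  when 0 then ""
  when 1 then words[0].to_s
  when 2 then "#{words[0]} and #{words[1]}"
  else "#{words[0...-1].join(', ')}, and #{words[-1]}"
  end
end

to_sentence_sketch(%w(Larry Curly Moe))   # => "Larry, Curly, and Moe"
to_sentence_sketch(%w(Larry Curly))       # => "Larry and Curly"
```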

Grouping

core_ext/array/grouping.rb
  • Array#in_groups_of(size, fill_with) groups elements of an array into fixed-size groups:
    
    	(1..8).to_a.in_groups_of(3) # => [[1, 2, 3], [4, 5, 6], [7, 8, nil]]
    	(1..8).to_a.in_groups_of(3, 0) # => [[1, 2, 3], [4, 5, 6], [7, 8, 0]]
    	(1..8).to_a.in_groups_of(3, false) # => [[1, 2, 3], [4, 5, 6], [7, 8]]
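The padding behavior is easy to reproduce with Enumerable#each_slice. The sketch below is an approximation written for illustration, not the ActiveSupport source; in_groups_of_sketch is a hypothetical name.

```ruby
# Hypothetical stand-in for Array#in_groups_of.
# fill_with pads the last group; passing false disables padding.
def in_groups_of_sketch(array, size, fill_with = nil)
  array.each_slice(size).map do |group|
    if fill_with == false
      group
    else
      group + [fill_with] * (size - group.size)
    end
  end
end

in_groups_of_sketch((1..8).to_a, 3)          # => [[1, 2, 3], [4, 5, 6], [7, 8, nil]]
in_groups_of_sketch((1..8).to_a, 3, 0)       # => [[1, 2, 3], [4, 5, 6], [7, 8, 0]]
in_groups_of_sketch((1..8).to_a, 3, false)   # => [[1, 2, 3], [4, 5, 6], [7, 8]]
```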
    
    
End of content preview. The rest of this section is not viewable here.
RailTies
RailTies is the set of components that wire together ActiveRecord, ActionController, and ActionView to form Rails. We will examine the two most important parts of RailTies: how Rails is initialized and how requests are processed.
The Rails::Configuration class, defined in initializer.rb, holds the configuration attributes that control Rails. It has several general Rails attributes defined as attributes on the Configuration class, but there is a little cleverness in the framework class stubs. The five class stubs (action_controller, action_mailer, action_view, active_resource, and active_record) act as proxies to the class attributes of their respective Base classes. In this way, the configuration statement:

	config.action_controller.perform_caching = true

is the same as:

	ActionController::Base.perform_caching = true

except with a unified configuration syntax.
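One way to picture a stub that proxies to a Base class is a small forwarding object. This is a sketch of the idea only and makes no claim about how initializer.rb implements it; SketchController, SettingsProxy, and SketchConfiguration are hypothetical names.

```ruby
# Hypothetical stand-in for a framework Base class with a class-level setting.
module SketchController
  class Base
    class << self
      attr_accessor :perform_caching
    end
  end
end

# Forwards any call the target understands; everything else fails as usual.
class SettingsProxy
  def initialize(target)
    @target = target
  end

  def method_missing(name, *args, &block)
    if @target.respond_to?(name)
      @target.send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @target.respond_to?(name) || super
  end
end

# Hypothetical configuration object exposing one framework stub.
class SketchConfiguration
  attr_reader :action_controller

  def initialize
    @action_controller = SettingsProxy.new(SketchController::Base)
  end
end

config = SketchConfiguration.new
config.action_controller.perform_caching = true
SketchController::Base.perform_caching   # => true
```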
initializer.rb
Rails::Initializer is the main class that handles setting up the Rails environment within Ruby. Initialization is kicked off by config/environment.rb, which contains the block:

	Rails::Initializer.run do |config|
	  # (configuration)
	end

Rails::Initializer.run yields a new Rails::Configuration object to the block. Then run creates a new Rails::Initializer object and calls its process method, which takes the following steps in order to initialize Rails:
  1. check_ruby_version: Ensures that Ruby 1.8.2 or above (but not 1.8.3) is being used.
  2. set_load_path: Adds the framework paths (RailTies, ActionPack, ActiveSupport, ActiveRecord, Action Mailer, and Action Web Service) and the application's load paths to the Ruby load path. The framework is loaded from vendor/rails or a location specified in RAILS_FRAMEWORK_ROOT.
  3. require_frameworks: Loads each framework listed in the frameworks configuration option. If the framework path was not specified in
End of content preview. The rest of this section is not viewable here.
Further Reading
Diomidis Spinellis's book Code Reading: The Open Source Perspective (Addison-Wesley) offers advice on how to approach large codebases, particularly those of open source software.
The Ruby Facets core library is another collection of code that aims to provide utility methods for Ruby. This library covers some of the same ground as the Core Extensions, but also provides additional extensions.
If you need more complicated manipulations to the English language than the Inflector class allows, look to the Ruby Linguistics project.
End of content preview. The rest of this section is not viewable here.
Chapter 3: Rails Plugins
Civilization advances by extending the number of important operations which we can perform without thinking of them.
—Alfred North Whitehead
Ruby on Rails is very powerful, but it cannot do everything. There are many features that are too experimental, out of scope of the Rails core, or even blatantly contrary to the way Rails was designed (it is opinionated software, after all). The core team cannot and would not include everything that anybody wants in Rails.
Luckily, Rails comes with a very flexible extension system. Rails plugins allow developers to extend or override nearly any part of the Rails framework, and share these modifications with others in an encapsulated and reusable manner.
End of content preview. The rest of this section is not viewable here.
About Plugins
By default, plugins are loaded from directories under vendor/plugins in the Rails application root. Should you need to change or add to these paths, the plugin_paths configuration item contains the plugin load paths:

	config.plugin_paths += [File.join(RAILS_ROOT, 'vendor', 'other_plugins')]

By default, plugins are loaded in alphabetical order; attachment_fu is loaded before http_authentication. If the plugins have dependencies on each other, a manual loading order can be specified with the plugins configuration element:

	config.plugins = %w(prerequisite_plugin actual_plugin)

Any plugins not specified in config.plugins will not be loaded. However, if the last plugin specified is the symbol :all, Rails will load all remaining plugins at that point. Rails accepts either symbols or strings as plugin names here.

	config.plugins = [ :prerequisite_plugin, :actual_plugin, :all ]
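The ordering rules described so far (alphabetical by default, an explicit list when given, and :all as a trailing wildcard) can be summarized in a small function. This is a sketch of the rules as stated in the text, not the Rails plugin loader; plugin_load_order is a hypothetical helper.

```ruby
# Hypothetical helper: computes the load order from the plugin names found on
# disk (available) and an optional explicit list (configured).
def plugin_load_order(available, configured = nil)
  sorted = available.map { |name| name.to_s }.sort
  return sorted if configured.nil?          # default: alphabetical order

  requested = configured.map { |name| name.to_s }   # symbols or strings
  load_rest = requested.last == "all"
  requested.pop if load_rest

  order = requested.dup
  order += (sorted - requested) if load_rest        # :all appends the remainder
  order
end

available = %w(http_authentication attachment_fu actual_plugin prerequisite_plugin)

plugin_load_order(available)
# => ["actual_plugin", "attachment_fu", "http_authentication", "prerequisite_plugin"]

plugin_load_order(available, %w(prerequisite_plugin actual_plugin))
# => ["prerequisite_plugin", "actual_plugin"]

plugin_load_order(available, [:prerequisite_plugin, :actual_plugin, :all])
# => ["prerequisite_plugin", "actual_plugin", "attachment_fu", "http_authentication"]
```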

The plugin locator searches for plugins under the configured paths, recursively. Because a recursive search is performed, you can organize plugins into directories; for example, vendor/plugins/active_record_acts and vendor/plugins/view_extensions.
The actual plugin locating and loading system is extensible, and you can write your own strategies. The locator (which by default is Rails::Plugin::FileSystemLocator) searches for plugins; the loader (by default Rails::Plugin::Loader) determines whether a directory contains a plugin and does the work of loading it.
To write your own locators and loaders, examine railties/lib/rails/plugin/locator.rb and railties/lib/rails/plugin/loader.rb. The locators (more than one locator can be used) and loader can be changed with configuration directives:

	config.plugin_locators += [MyPluginLocator]

	config.plugin_loader = MyPluginLoader

Plugins are most often installed with the built-in Rails plugin tool, script/plugin. This plugin tool has several commands:
discover/source/unsource/sources
End of content preview. The rest of this section is not viewable here.
Writing Plugins
Content preview
Once you know how to extend Rails by opening classes, it is easy to write a plugin. First, let's look at the directory structure of a typical plugin.
Figure: Directory structure of a typical plugin
There are several files and directories involved in a Rails plugin:
about.yml (not shown)
This is the newest feature of Rails plugins—embedded metadata. Right now, this feature works only with RaPT. The command rapt about plugin_name will give a summary of the plugin's information. In the future, more features are expected; right now, it exists for informational purposes. Metadata is stored in the about.yml file; here is an example from acts_as_attachment:

	author: technoweenie
	summary: File upload handling plugin.
	homepage: http://technoweenie.stikipad.com
	plugin: http://svn.techno-weenie.net/projects/plugins/acts_as_attachment
	license: MIT
	version: 0.3a
	rails_version: 1.1.2+

init.rb
This is a Ruby file run upon initialization of the plugin. Typically, it will require files from the lib/ directory. As many plugins patch core functionality, init.rb may extend core classes with extensions from the plugin:

	require 'my_plugin'
	ActionController::Base.send :include, MyPlugin::ControllerExtensions

The send hack is needed here because Module#include is a private method and, at least for now, send bypasses access control on the receiver.
install.rb (not shown)
This hook is run when the plugin is installed with one of the automated plugin installation tools such as script/plugin or RaPT. It is a good idea not to do anything mission-critical in this file, as it will not be run if the plugin is installed manually (by checking out the source to a directory under vendor/plugins).
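Returning to the send call in the init.rb example: the following plain-Ruby sketch shows how send reaches a private method that an ordinary call cannot. SketchController and SketchExtensions are hypothetical names; the sketch makes include private explicitly so that the behavior does not depend on the Ruby version in use.

```ruby
module SketchExtensions
  def greeting
    "hello from the plugin"
  end
end

class SketchController
  # Make include private on this class, mimicking the situation described
  # for init.rb (an assumption made explicit for this sketch).
  private_class_method :include
end

# An ordinary call with an explicit receiver is rejected...
blocked = begin
  SketchController.include(SketchExtensions)
  false
rescue NoMethodError
  true
end

# ...but send bypasses the access check.
SketchController.send :include, SketchExtensions

blocked                         # => true
SketchController.new.greeting   # => "hello from the plugin"
```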
End of content preview. The rest of this section is not viewable here.
Plugin Examples
Content preview
To illustrate the flexibility and design of a typical Rails plugin, we will examine some of the plugins available from the rubyonrails.org Subversion repository. Most of these plugins are used fairly commonly; many of them are used in 37signals applications. Consider them the "standard library" of Rails. They are all available from http://svn.rubyonrails.org/rails/plugins
Plugins can be very simple in structure. For example, consider David Heinemeier Hansson's account_location plugin. This plugin provides controller and helper methods to support using part of the domain name as an account name (for example, to support customers with domain names of customer1.example.com and customer2.example.com, using customer1 and customer2 as keys to look up the account information). To use the plugin, include AccountLocation in one or more of your controllers, which adds the appropriate instance methods:

	class ApplicationController < ActionController::Base
	  include AccountLocation
	end

	ApplicationController.instance_methods.grep(/^account/)
	# => ["account_domain", "account_subdomain", "account_host", "account_url"]

Including the AccountLocation module in the controller allows you to access various URL options from the controller and the view. For example, to set the @account variable from the subdomain on each request:

	class ApplicationController < ActionController::Base
	  include AccountLocation
	  before_filter :find_account

	  protected

	  def find_account
	    @account = Account.find_by_username(account_subdomain)
	  end
	end

The account_location plugin has no init.rb; nothing needs to be set up on load, as all functionality is encapsulated in the AccountLocation module. Here is the implementation, in lib/account_location.rb (minus some license text):

	module AccountLocation
	  def self.included(controller)
	    controller.helper_method(:account_domain, :account_subdomain,
	      :account_host, :account_url)
	  end

	  protected

	  def default_account_subdomain
	    @account.username if @account && @account.respond_to?(:username)
	  end

	  def account_url(account_subdomain = default_account_subdomain,
	      use_ssl = request.ssl?)
	    (use_ssl ? "https://" : "http://") + account_host(account_subdomain)
	  end

	  def account_host(account_subdomain = default_account_subdomain)
	    account_host = ""
	    account_host << account_subdomain + "."
	    account_host << account_domain
	  end

	  def account_domain
	    account_domain = ""
	    account_domain << request.subdomains[1..-1].join(".") +
	      "." if request.subdomains.size > 1
	    account_domain << request.domain + request.port_string
	  end

	  def account_subdomain
	    request.subdomains.first
	  end
	end

End of content preview. The rest of this section is not viewable here.
Testing Plugins
Content preview
Like the rest of Rails, plugins have very mature testing facilities. However, plugin tests usually require a bit more work than standard Rails tests, as the tests are designed to be run on their own, outside of the Rails framework. Some things to keep in mind when writing tests for plugins:
  • Unlike in the Rails plugin initializer, when running tests, load paths are not set up automatically, and Dependencies does not load missing constants for you. You need to manually set up the load paths and require any parts of the plugin that you will be testing, as in this example from the HTTP Authentication plugin:
    
    	$LOAD_PATH << File.dirname(__FILE__) + '/../lib/'
    	require 'http_authentication'
    
    
  • Similarly, the plugin's init.rb file is not loaded, so you must set up anything your tests need, such as including your plugin's modules in the TestCase class:
    
    	class HttpBasicAuthenticationTest < Test::Unit::TestCase
    	  include HttpAuthentication::Basic

    	  # …
    	end
    
    
  • You must usually recreate (mock or stub) any Rails functionality involved in your test. In the case of the HTTP Authentication plugin, it would be too much overhead to load the entire ActionController framework for the tests. The functionality being tested is very simple, and requires very little of ActionController:
    
    	def test_authentication_request
    	  authentication_request(@controller, "Megaglobalapp")
    	  assert_equal 'Basic realm="Megaglobalapp"',
    	               @controller.headers["WWW-Authenticate"]
    	  assert_equal :unauthorized, @controller.renders.first[:status]
    	end
    
    
    To support this limited subset of ActionController's features, the test's setup method creates a stub controller:
    
    	def setup
    	  @controller = Class.new do
    	    attr_accessor :headers, :renders

    	    def initialize
    	      @headers, @renders = {}, []
    	    end

    	    def request
    	      Class.new do
    	        def env
    	          {'HTTP_AUTHORIZATION' =>
    	            HttpAuthentication::Basic.encode_credentials("dhh", "secret") }
    	        end
    	      end.new
    	    end

    	    def render(options)
    	      self.renders << options
    	    end
    	  end.new
    	end
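The stubbing technique from the list above can be tried without Rails or a test framework. In this sketch, authentication_request is a hypothetical, simplified stand-in for the plugin method under test (it is not the plugin's source); the stub controller mirrors the one built in setup, minus the request method.

```ruby
# Hypothetical, simplified stand-in for the plugin method under test.
def authentication_request(controller, realm)
  controller.headers["WWW-Authenticate"] = %(Basic realm="#{realm}")
  controller.render :status => :unauthorized
end

# Anonymous stub class: it only records what the code under test does to it.
stub_controller = Class.new do
  attr_accessor :headers, :renders

  def initialize
    @headers, @renders = {}, []
  end

  def render(options)
    renders << options
  end
end.new

authentication_request(stub_controller, "Megaglobalapp")

stub_controller.headers["WWW-Authenticate"]  # => "Basic realm=\"Megaglobalapp\""
stub_controller.renders.first[:status]       # => :unauthorized
```

Because the stub only records calls, the assertions inspect its headers and renders collections instead of a real HTTP response.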
    
    
End of content preview. The rest of this section is not viewable here.
Further Reading
Content preview
Geoffrey Grosenbach has a two-part article on Rails plugins, including some information on writing plugins. The two parts are available from the following:
End of content preview. The rest of this section is not viewable here.
Chapter 4: Database
Content preview
All non-trivial abstractions, to some degree, are leaky.
—Joel Spolsky
For many developers, Rails starts with the database. One of the most compelling features of Rails is ActiveRecord, the object-relational mapping (ORM) layer. ActiveRecord does such a good job of hiding the gory details of SQL from the programmer that it almost seems like magic.
However, as Joel Spolsky says, all abstractions are leaky. There is no perfectly transparent ORM system, and there never will be, due to the fundamentally different nature of the object-oriented and relational models. Ignore the underlying database at your own peril.
Database Management Systems
Content preview
The Rails community has been built around the MySQL database management system (DBMS) for years. However, there are still a lot of misconceptions surrounding DBMSs, especially when used with Rails. While MySQL has its place, it is certainly not the only option. In the past few years, support for other databases has vastly grown. I encourage you to keep an open mind throughout this chapter, and weigh all criteria before making a decision on a DBMS.
Rails supports many DBMSs; at the time of this writing, DB2, Firebird, FrontBase, MySQL, OpenBase, Oracle, PostgreSQL, SQLite, Microsoft SQL Server, and Sybase are supported. You will probably know if you need to use a DBMS other than the ones mentioned here. Check the RDoc for the connection adapter for any caveats specific to your DBMS; some features such as migrations are only supported on a handful of connection adapters.
I list PostgreSQL first because it is my platform of choice. It is one of the most advanced open source databases available today. It has a long history, dating back to the University of California at Berkeley's Ingres project from the early 1980s. In contrast to MySQL, Postgres has supported advanced features such as triggers, stored procedures, custom data types, and transactions for much longer.
PostgreSQL's support for concurrency is more mature than MySQL's. Postgres supports multiversion concurrency control (MVCC), which is even more advanced than row-level locking. MVCC can isolate transactions, using timestamps to give each concurrent transaction its own snapshot of the data set. Under the Serializable isolation level, this prevents such problems as dirty reads, nonrepeatable reads, and phantom reads. See the upcoming sidebar, "Multiversion Concurrency Control," for more information about MVCC.
One advantage that PostgreSQL may have in the enterprise is its similarity to commercial enterprise databases such as Oracle, MS SQL Server, or DB2. Although Postgres is not by any means a clone or emulation of any commercial database, it will nevertheless be familiar to programmers and DBAs who have experience with one of the commercial databases. It will also likely be easier to migrate an application from Postgres to (say) Oracle than from MySQL to Oracle.
End of content preview. The rest of this section is not viewable here.
Large/Binary Objects
Content preview
Sooner or later, many web applications must deal with the issue of LOB (large object) data. LOB data may be small, but it is usually large compared to other attributes being stored (tens of kilobytes to hundreds of gigabytes or larger). The defining characteristic of LOB data, however, is that the application has no knowledge of the semantics of the internal structure of the data.
The canonical example is image data; a web application usually has no need to know the data in a JPEG file representing a user's avatar as long as it can send it to the client, replace it, and delete it when needed.
LOB storage is usually divided into CLOB (character large object) for text data and BLOB (binary large object) for everything else. Some DBMSs separate the two as separate data types. CLOB types can often be indexed, collated, and searched; BLOBs cannot.
The DBA types among us might prefer database storage of large objects. From a theoretical standpoint, storing binary data in the database is the cleanest and most straightforward solution. It offers some immediate advantages:
  • All of your application data is in the same place: the database. There is only one interface to the data, and one program is responsible for managing the data in all its forms.
  • You have greater flexibility with access control, which really helps when working with large-scale projects. DBMS permitting, different permissions may be assigned to different tables within the same database.
  • The binary data is not tied to a physical file path; when using filesystem storage, you must update the file paths in the referring database if you move the storage location.
There are many practical considerations, though, depending on your DBMS's implementation of large objects.

PostgreSQL

PostgreSQL has some downright weird support for binary data. There are two ways to store binary data in a PostgreSQL database: the BYTEA data type and large objects.
End of content preview. The rest of this section is not viewable here.
Advanced Database Features
Content preview
Among Rails programmers, advanced database features are often a point of contention. Some contend that constraints, triggers, and procedures are essential; some shun them completely, saying that intelligence belongs in the application only. I am sympathetic to the argument that all business logic belongs in the application; it is nearly impossible to make agile changes to changing requirements when logic is split between two locations. Still, I believe that constraints, triggers, and even stored procedures have their place in enterprise applications. In order to explain why, we'll have to examine a distinction that comes up often in relation to this debate: the difference between application and integration databases.
Martin Fowler differentiates between application databases and integration databases. The basic distinction is that an integration database is shared among many applications, while an application database "belongs" to the one application using it.
In this sense, "application" can mean one program or multiple programs within an application boundary (the same logical application). Usually this distinction refers to how the schema is organized; in Rails, integration databases are often referred to as databases with "legacy schemas." In application databases, integration can still be performed through messaging at the application layer rather than the database layer.
Rails is opinionated about how your database schemas should be structured: the primary key should be id, foreign keys should be thing_id, and table names should be plural. This is not database bigotry; Rails has to choose a sensible default for the "convention over configuration" paradigm to be effective. It is relatively painless to change almost any of these defaults. Rails plays nice with integration databases.
Many Rails developers shun integration databases as unnecessary; they maintain that all integration should be done at the application layer. Some take that a step further and state that data integrity checking belongs in the application only, to keep all business logic in the same place. Although this might be ideal, the real world is not always that nice. Even if all integration can be done at the application level, there are still plenty of valid reasons to use database constraints.
End of content preview. The rest of this section is not viewable here.
Connecting to Multiple Databases
Content preview
Occasionally, you will have the need to connect to several different databases from one application. This is useful for migrating from an old schema to a new one. It is also helpful if you have differing data requirements within one application; perhaps some data is more critical and is stored on a high-availability database cluster. In any case, it is easy in Rails. First, specify multiple database environments in the database.yml configuration file:

	legacy:
	  adapter: mysql
	  database: my_db
	  username: user
	  password: pass
	  host: legacy_host

	new:
	  adapter: mysql
	  database: my_db
	  username: user
	  password: pass
	  host: new_host

Then, you can simply refer to these configuration blocks from the ActiveRecord class definition using the ActiveRecord::Base.establish_connection method:

	class LegacyClient < ActiveRecord::Base
	  establish_connection "legacy"
	end

	class Client < ActiveRecord::Base
	  establish_connection "new"
	end

This approach also works with multiple Rails environments. Just specify each environment in the database.yml file as usual:

	legacy_development:
	  # ...

	legacy_test:
	  # ...

	legacy_production:
	  # ...

	new_development:
	  # ...

	new_test:
	  # ...

	new_production:
	  # ...

Then, use the RAILS_ENV constant in the database configuration block name:

	class LegacyClient < ActiveRecord::Base
	  establish_connection "legacy_#{RAILS_ENV}"
	end

	class Client < ActiveRecord::Base
	  establish_connection "new_#{RAILS_ENV}"
	end
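The naming convention can be checked outside Rails with a small sketch. The YAML below is a hypothetical, shortened database.yml with made-up host names, and config_for is an illustrative helper, not a Rails method; it only shows how the interpolated block name selects one configuration per environment.

```ruby
require 'yaml'

# Hypothetical database.yml contents; host names are made up for this sketch.
DATABASE_YML = <<-YML
legacy_development:
  adapter: mysql
  host: legacy_dev_host
legacy_production:
  adapter: mysql
  host: legacy_prod_host
new_development:
  adapter: mysql
  host: new_dev_host
YML

CONFIGS = YAML.load(DATABASE_YML)

# Build the block name the same way the models above do.
def config_for(prefix, env)
  CONFIGS.fetch("#{prefix}_#{env}")
end

legacy_dev = config_for("legacy", "development")["host"]  # => "legacy_dev_host"
new_dev    = config_for("new", "development")["host"]     # => "new_dev_host"
```

Using fetch makes a missing block (here, new_production) fail immediately with a KeyError rather than silently returning nil.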

You can go one step further and DRY this code up by using class inheritance to define which database an ActiveRecord class belongs to:

	class LegacyDb < ActiveRecord::Base
	  self.abstract_class = true
	  establish_connection "legacy_#{RAILS_ENV}"
	end

	class NewDb < ActiveRecord::Base
	  self.abstract_class = true
	  establish_connection "new_#{RAILS_ENV}"
	end

	class LegacyClient < LegacyDb
	end

	class Client < NewDb
	end

The self.abstract_class = true statements tell ActiveRecord that the LegacyDb
End of content preview. The rest of this section is not viewable here.
Caching
Content preview
If you have far more reads than writes, model caching may help lighten the load on the database server. The standard in-memory cache these days is memcached. Developed for LiveJournal, memcached is a distributed cache that functions as a giant hashtable. Because of its simplicity, it is scalable and fast. It is designed never to block, so there is no risk of deadlock. There are four simple operations on the cache, each completing in constant time.
You can actually use memcached in several different places in Rails. It is available as a session store or a fragment cache store out of the box, assuming the ruby-memcache gem is installed. It can also be used to store complete models—but remember that this will only be effective for applications where reads vastly outnumber writes. There are two libraries that cover model caching: cached_model and acts_as_cached.
The cached_model library (http://dev.robotcoop.com/Libraries/cached_model/index.html) provides an abstract subclass of ActiveRecord::Base, CachedModel. It attempts to be as transparent as possible, just caching the simple queries against single objects and not trying to do anything fancy. It does have the disadvantage that all cached models must inherit from CachedModel. Use of cached_model is dead simple:

	class Client < CachedModel
	end

On the other hand, the acts_as_cached plugin (http://errtheblog.com/post/27) gives you more specificity over what is cached. It feels more like programming against memcached's API, but there is more power and less verbosity. It has support for relationships between objects, and it can even version each key to invalidate old keys during a schema change. A sample instance of acts_as_cached might look like this:

	class Client < ActiveRecord::Base
	  acts_as_cached
	  # We have to expire the cache ourselves upon significant changes
	  after_save :expire_me
	  after_destroy :expire_me

	  protected

	  def expire_me
	    expire_cache(id)
	  end
	end
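The expire-on-write pattern in this example can be sketched without memcached or ActiveRecord. In the following toy model, ToyCache is an in-process hash standing in for memcached, and ClientFinder is a hypothetical class standing in for the model; neither reflects the API of cached_model or acts_as_cached.

```ruby
# In-process stand-in for memcached (hypothetical; a real cache is remote).
class ToyCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end

  def delete(key)
    @store.delete(key)
  end
end

# Hypothetical read-through finder: check the cache first, fall back to the
# "database" (a plain hash here), and expire the key on every write.
class ClientFinder
  attr_reader :db_reads

  def initialize(cache, rows)
    @cache, @rows, @db_reads = cache, rows, 0
  end

  def find(id)
    key = "Client:#{id}"
    cached = @cache.get(key)
    return cached if cached

    @db_reads += 1
    row = @rows.fetch(id)
    @cache.set(key, row)
    row
  end

  def update(id, attrs)
    @rows[id] = @rows.fetch(id).merge(attrs)
    @cache.delete("Client:#{id}")   # same role as expire_me above
  end
end

finder = ClientFinder.new(ToyCache.new, 1 => { :name => "Acme" })
finder.find(1)
finder.find(1)
reads_before_update = finder.db_reads      # => 1 (second find came from the cache)
finder.update(1, :name => "Acme Corp")
name_after_update = finder.find(1)[:name]  # => "Acme Corp"
```

Without the delete in update, the second lookup would keep returning the stale row, which is why the model above expires its own cache entry after saves and destroys.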

Of course, the proper solution for you will depend on the specific needs of the application. Keep in mind that any caching is primarily about optimization, and the old warnings against premature optimization always apply. Optimization should always be targeted at a specific, measured performance problem. Without specificity, you don't know what metric you are (or should be) measuring. Without measurement, you don't know when or by how much you've improved it.
End of content preview. The rest of this section is not viewable here.
Load Balancing and High Availability
Content preview
Many applications require some form of load balancing and/or high availability. Though these terms are often used together and they can often be obtained by the same methods, they are fundamentally two different requirements. We define them thus:
Load balancing
Spreading request load over several systems so as to reduce the load placed on a single system.
High availability
Resiliency to the failure of one or several constituent components; the ability to continue providing services without interruption despite component failure.
These are completely different things, but they are often required and/or provided together. It is important to understand the difference between them in order to properly analyze the requirements of an application. It is possible to provide load balancing without high availability—for example, consider a group of servers presented to the Internet via round-robin DNS. The load is distributed roughly equally over the group of servers, but the system is certainly not highly available! If one server goes down, DNS will still faithfully distribute requests to it, and one in every N requests will go unanswered.
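The round-robin failure mode is easy to quantify with a toy simulation. The server names and request count below are made up, and the strict rotation is a simplification of how DNS round-robin behaves in practice.

```ruby
# Hypothetical pool of N = 3 servers; false marks a server that is down.
servers  = { "app1" => true, "app2" => true, "app3" => false }
names    = servers.keys
requests = 300

# Strict rotation: request i goes to server i mod N, whether or not it is up.
failed = (0...requests).count { |i| !servers[names[i % names.size]] }

failed  # => 100, i.e. one in every N requests goes unanswered
```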
Conversely, high availability can be provided without load balancing. High availability necessitates the use of redundant components, but nothing says that those components must be online and in use. A common configuration is the hot spare: a duplicate server that stays powered up but offline, continually monitoring its online twin, ready to take over if necessary. This can actually be more economical than trying to balance requests between the two servers and keep them in sync.
In this section, we review the primary load balancing and high availability solutions for common database management systems.

Replication

MySQL has built-in support for master-slave replication. The master logs all transactions to a
End of content preview. The rest of this section is not viewable here.
LDAP
Content preview
LDAP, the Lightweight Directory Access Protocol, is a database system optimized for user directory information. It is most often used in large organizations, integrated with the enterprise authentication and email systems. However, it is a database in its own right. We do not have space to cover LDAP in detail, but there are many resources available for working with LDAP in Rails.
The ActiveLDAP library (http://ruby-activeldap.rubyforge.org/) is an almost drop-in replacement for ActiveRecord that uses LDAP instead of an RDBMS as a backend. To use it from Rails, set up a configuration file, config/ldap.yml, as follows:

	development:
	  host: (ldap server name)
	  port: 389
	  base: dc=mycompany,dc=com
	  password: my_password

	production:
	  # ...

Then, at the bottom of config/environment.rb, set up the connection:

	ldap_path = File.join(RAILS_ROOT,"config","ldap.yml")
	ldap_config = YAML.load(File.read(ldap_path))[RAILS_ENV]
	ActiveLDAP::Base.establish_connection(ldap_config)

To set up ActiveLDAP, just subclass ActiveLDAP::Base and set the LDAP mapping on a class-by-class basis:

	class Employee < ActiveLDAP::Base
	  ldap_mapping :prefix => "ou=Employees"
	end

LDAP queries can then be executed using the class methods on ActiveLDAP::Base:
	

	@dan = Employee.find :attribute => "cn", :value => "Dan"

One of the most common reasons for using LDAP is to integrate into an existing authentication structure. If an LDAP server is provided for a Windows domain, this will allow the web application to authenticate users against that domain rather than maintaining its own user models separately.
Set up the ldap.yml file as described previously (without specifying a password), but do not bind to the LDAP server from environment.rb. We will perform the bind as part of the authentication process. The following code is adapted from the Rails wiki:

	class LdapUser < ActiveLDAP::Base
	  ldap_mapping :prefix => (LDAP prefix for your users)

	  LDAP_PATH = File.join(RAILS_ROOT,"config","ldap.yml")
	  LDAP_CONFIG = YAML.load(File.read(LDAP_PATH))[RAILS_ENV]

	  # Returns true if the server accepts a bind with these credentials
	  def self.authenticate username, password
	    begin
	      ActiveLDAP::Base.establish_connection(LDAP_CONFIG.merge(
	        :bind_format => "uid=#{username},cn=users,dc=mycompany,dc=com",
	        :password => password,
	        :allow_anonymous => false
	      ))
	      ActiveLDAP::Base.close
	      return true
	    rescue ActiveLDAP::AuthenticationError
	      return false
	    end
	  end
	end

End of content preview. The rest of this section is not viewable here.
Further Reading
Content preview
Chris Date's Database in Depth (O'Reilly) is a very accessible introduction to relational theory aimed at software developers who are experienced in the use of relational databases. It reintroduces readers into the technical foundations behind the relational model.
Theo Schlossnagle's Scalable Internet Architectures (Sams) is a short but comprehensive treatment of ways to accomplish scalability (both high availability and load balancing are covered); it covers ground from the smallest two-server failover cluster up to global server load balancing.
Both the MySQL manual (http://dev.mysql.com/doc/) and the PostgreSQL manual (http://www.postgresql.org/docs/) have a wealth of information about general database topics, as well as specific information pertaining to the use of those DBMSs.
End of content preview. The rest of this section is not viewable here.
Chapter 5: Security
Content preview
Given a choice between dancing pigs and security, users will pick dancing pigs every time.
—Ed Felten and Gary McGraw
Security issues are often overlooked on smaller sites or low-traffic applications; unfortunately, the reach of the Web has expanded to a point where end-to-end security is essential on any public-facing web site. There actually are people with nothing better to do than run a distributed denial-of-service attack on "Aunt Edna's Funny Cat Pictures." Nobody can afford to ignore the dangers that face a site simply as a consequence of being accessible on the Internet.
In this chapter, we will take a top-down approach to examining the various security-related issues that plague web application developers. We start by examining the architectural, application-level principles you should keep in mind. Later, we will get progressively more detailed. We will examine the security-related issues you should keep in mind when working at a lower level in Rails.
Application Issues
Content preview
First, we will examine some important principles that should guide the design of any web application.
The most important guideline in the area of authentication is simple:

          Always salt and hash all passwords!

        
There are very few valid exceptions to this rule, and even fewer apply to web applications. The only possible reason to store passwords in plain text is if they must be provided to an external service in plain text. Even then, the passwords should be symmetrically encrypted with a shared secret, to provide defense in depth.
Let's examine the reasoning behind this rule. Hashing passwords prevents them from being recovered if the database or source code is compromised. Salting them protects them from rainbow attacks.
Salting is the process of ensuring that the same password hashes to different values for different users. Consider the following code, which hashes but does not salt.

	require 'digest/sha1'

	$hashes = {}

	def hash(password)
	  Digest::SHA1.hexdigest(password)
	end

	def store_password(login, password)
	  $hashes[login] = hash(password)
	end

	def verify_password(login, password)
	  $hashes[login] == hash(password)
	end

	store_password('alice', 'kittens')
	store_password('bob',   'kittens')

	$hashes # => {"alice"=>"3efd62ee86d4a141c3e671d86ba1579f934cf04d",
	        # "bob"=> "3efd62ee86d4a141c3e671d86ba1579f934cf04d"}

	verify_password('alice', 'kittens') # => true
	verify_password('alice', 'mittens') # => false
	verify_password('bob',   'kittens') # => true

Although this is more secure than storing the passwords in plain text, it is still insecure; anyone who has the hash file can tell that Alice and Bob have the same password.
More importantly, this scheme is vulnerable to a rainbow attack. An attacker can precompute rainbow tables by running every word in a dictionary through the hash function. He can then compare each hash in the rainbow table to each hash in the password file. Since a password always hashes to the same value, the attacker obtains all the dictionary passwords in one fell swoop.
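A per-user salt addresses both weaknesses. The following is a minimal sketch rather than a production scheme: the store layout, the method names, and the use of SecureRandom are assumptions made for illustration, and SHA-1 is kept only to match the example above (a deliberately slow password hash would be the usual choice today).

```ruby
require 'digest/sha1'
require 'securerandom'

# Hypothetical in-memory store: login => { :salt => ..., :digest => ... }
$salted_hashes = {}

# Mix a per-user salt into the digest so equal passwords hash differently.
def salted_digest(salt, password)
  Digest::SHA1.hexdigest("#{salt}--#{password}")
end

def store_salted_password(login, password)
  salt = SecureRandom.hex(16) # fresh random salt for every user
  $salted_hashes[login] = { :salt => salt, :digest => salted_digest(salt, password) }
end

def verify_salted_password(login, password)
  record = $salted_hashes[login]
  return false unless record
  record[:digest] == salted_digest(record[:salt], password)
end

store_salted_password('alice', 'kittens')
store_salted_password('bob',   'kittens')

# Same password, different salts, so the stored digests no longer match.
$salted_hashes['alice'][:digest] == $salted_hashes['bob'][:digest] # => false
verify_salted_password('alice', 'kittens') # => true
verify_salted_password('alice', 'mittens') # => false
```

Because every user has a different salt, a table precomputed from a dictionary is useless; the attacker would have to rebuild it once per user.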
End of content preview. The remainder of this section cannot be viewed here.
Web Issues
Content preview
Now that we have examined some of the architectural ways that you can protect your application, we will take a look at some of the issues endemic to the Web.
Most web frameworks have some form of session management: a persistent server-side storage mechanism for data specific to one client's browsing session. The exact scope of a "browsing session" depends on implementation details and the method of session tracking. Most commonly, a non-persistent cookie is used, so a session consists of all visits to a site before closing the browser. Alternatively, a persistent cookie (one with an explicit expiration date) can be used; this will persist even when the browser is closed. This is useful to remember information (such as a shopping cart) across visits for otherwise anonymous users. Some frameworks such as Seaside provide URL-based (query-string) sessions so that a user may even have multiple sessions active at the same time in different browser windows.
Most of Rails's session storage methods provide the following properties:
Confidentiality
Nobody except the server can read the data stored in the session.
Integrity
Nobody except the server, including the client itself, can modify the data stored in the session other than by throwing the session out and obtaining a new one. A corollary is that only the server should be able to create valid sessions.
The traditional session storage methods in Rails are server-side; they store all of the session data on the server, generate a random key, and use that as the session ID. The session ID is not tied to the data other than as an index, so it is safe to present to the client without compromising confidentiality.
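The server-side model can be sketched in a few lines. This is an illustration only, not Rails's implementation: the SESSIONS hash and both method names are hypothetical, and SecureRandom stands in for whatever source of randomness the framework uses.

```ruby
require 'securerandom'

# Hypothetical server-side store: session ID => session data
SESSIONS = {}

# Create a session and return its ID. The ID is random, so it reveals
# nothing about the data it indexes.
def create_session(data = {})
  session_id = SecureRandom.hex(16) # 32 hex characters
  SESSIONS[session_id] = data
  session_id
end

# Look up session data; an unknown or forged ID yields nil.
def find_session(session_id)
  SESSIONS[session_id]
end

id = create_session(:user_id => 42)
find_session(id)[:user_id] # => 42
find_session('guessed')    # => nil
```

Only the ID travels to the client; the data never leaves the server, which is what gives this scheme its confidentiality and integrity properties.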
Rails uses as much randomness as possible to create the session ID: it takes an MD5 hash of the current time, a random number, the process ID, and a constant. This is important: guessable session IDs allow
End of content preview. The remainder of this section cannot be viewed here.
SQL Injection
Content preview
SQL injection is an attack against programs that do not take proper precautions when accessing a SQL-based database. A standard example of vulnerable code is:

	search = params[:q]

	Person.find_by_sql %(SELECT * FROM people WHERE name = '#{search}%')

Of course, all someone has to do is search for "'; DROP TABLE people; --", which yields the following statement:

	SELECT * FROM people WHERE name = ''; DROP TABLE people; --%';

Everything after the -- is treated as a SQL comment (otherwise, the attempt might cause a SQL error). First, the SELECT statement is executed; then the DROP TABLE statement causes havoc. Ideally, the database user that executes that statement should not have DROP TABLE privileges, but SQL injection is always damaging. There are plenty of other attack vectors.
Another typical example of SQL injection is a query such as "' OR 1 = 1; --", which yields:

	SELECT * FROM people WHERE name = '' OR 1 = 1; --%';

This query would return all records from the people table. This can have security implications, especially when this sort of code is found in authentication systems.
For applications written against the standard APIs, Rails is amazingly well protected against SQL injection attacks. All of the standard finders and dynamic attribute finders sanitize single attribute arguments, but there is only so much that they can do. Remember the cardinal rule: never interpolate user input into a SQL string.
Most of the Rails finders that accept SQL also accept an array, so you can turn code like "SELECT * FROM people WHERE name = '#{search_name}'" into ["SELECT * FROM people WHERE name = ?", search_name] nearly anywhere. (Note the lack of quoting around the question mark; Rails interprets the type of search_name and quotes it appropriately.) The user-provided name value will have any special SQL characters escaped, so you don't have to worry about it.
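To see why the placeholder form is safe, here is a deliberately simplified sketch of what the substitution does. It is not Rails's code: bind_placeholders and quote_sql_string are hypothetical names, only string values are handled, and quote doubling is simply the standard SQL way of escaping a single quote (real adapters apply database-specific rules).

```ruby
# Quote a value as a SQL string literal, doubling any embedded single quotes.
def quote_sql_string(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Replace each ? in the statement with the next quoted value.
def bind_placeholders(sql, *values)
  remaining = values.dup
  sql.gsub('?') { quote_sql_string(remaining.shift) }
end

attack = "'; DROP TABLE people; --"
bind_placeholders("SELECT * FROM people WHERE name = ?", attack)
# => "SELECT * FROM people WHERE name = '''; DROP TABLE people; --'"
```

The whole input stays inside a single string literal, so the database compares it against the name column instead of executing it.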
For any situations where you need to do this quoting yourself, you can steal the private sanitize_sql method from ActiveRecord::Base (just don't tell anyone):
End of content preview. The remainder of this section cannot be viewed here.
Ruby's Environment
Content preview
No analysis of Rails security would be complete without examining the environment that Ruby lives in.
The Kernel.system method is useful for basic interaction with system services through the command line. As with SQL, though, it is important to ensure that you know exactly what is being passed, especially if it comes from an external source.
The best way to protect against malicious user input making it to the shell is to use the multiparameter version of system, passing only the command name in the first parameter. The subsequent parameters are handed directly to the program as its arguments, without being interpreted by a shell, which makes it much harder to slip something into the command line unnoticed:

	def svn_commit(message)
	  system("/usr/local/bin/svn", "ci", "-m", message)
	end

The message passed in to that method will always be the third parameter to svn, no matter what kind of shell metacharacters it contains.
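A small experiment makes this concrete. The sketch below is an illustration under assumptions: it uses IO.popen with an array, which bypasses the shell in the same way as multiparameter system, and a child Ruby process (located through RbConfig.ruby) that simply reports what it received. The helper name and the message text are invented for the example.

```ruby
require 'rbconfig'

# Run a child Ruby process that reports how many arguments it received and
# what the first one was. Passing an array to IO.popen, like passing several
# parameters to system, hands the arguments over without involving a shell.
def arguments_seen_by_child(message)
  child_code = 'puts ARGV.length; puts ARGV.first'
  IO.popen([RbConfig.ruby, '-e', child_code, message]) { |io| io.read }.split("\n")
end

message = 'fixed bug; rm -rf /tmp/x && echo $(whoami)'
arguments_seen_by_child(message)
# => ["1", "fixed bug; rm -rf /tmp/x && echo $(whoami)"]
```

The semicolon, the && and the $(...) all arrive as literal characters inside one argument; nothing is executed on their behalf.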
Tainting is an idea that came to Ruby from Perl. Because data that comes from the outside is not to be trusted, why not force it not to be trusted by default? Any data read from the environment or outside world is marked as tainted. Depending on the current value of a special Ruby global, $SAFE, certain operations are prohibited on tainted data. Objects may only (officially) be untainted by calling their untaint method.
This is a good idea that, because of implementation details, has not gained much traction in the Rails community. It can become a pain to deal with every piece of data that was derived from user input. There is one Rails plugin, safe_erb, which leverages tainting to ensure that all user-supplied data is HTML-escaped before being displayed again. Request parameters and cookies are tainted upon each request, and an error is raised on any attempt to render tainted data. (The Ruby tainting facility is not used other than as a flag on the objects, because anything more would require a $SAFE level greater than zero, which is Rails-unfriendly.) This reduces the possibility of cross-site scripting attacks. The plugin is available at
End of content preview. The remainder of this section cannot be viewed here.
Further Reading
Content preview
The HTTP/1.1 specification, RFC 2616, has some guiding principles for security at the HTTP level (http://www.w3.org/Protocols/rfc2616/rfc2616-sec15.html).
Current Rails best practices for security are summarized at http://www.quarkruby.com/2007/9/20/ruby-on-rails-security-guide. This guide provides "cookbook"-style solutions for many real-world problems such as authentication; mitigating SQL injection, XSS, and CSRF; handling file uploads; and preventing form spam.
End of content preview. The remainder of this section cannot be viewed here.
Chapter 6: Performance
Content preview
Premature optimization is the root of all evil (or at least most of it) in programming.
—Donald Knuth (attributed to C. A. R. Hoare)
Performance is an interesting beast. Performance optimization has a bad reputation because it is frequently performed too early and too often, usually at the expense of readability, maintainability, and even correctness. Rails is generally fast enough, but it is possible to make it slow if you are not careful.
You should keep the following guidelines in mind when optimizing performance:
Algorithmic improvements always beat code tweaks
It is very tempting to try to squeeze every last bit of speed out of a piece of code, but often you can miss the bigger picture. No amount of C or assembly tweaking will make bubblesort faster than quicksort. Start your optimization from the top down.
As a general rule, maintainability beats performance
Your code should first be easy to read and understand, and only then optimized for speed.
Only optimize what matters
Typically, the code profile has a lopsided distribution: 80% of the time is spent in 20% of the code (for some value of 80% and 20%). It makes sense to spend your limited resources optimizing the sections that will bring the greatest gain in performance.
Measure twice, cut once
The only way to be certain about where your code is spending its time is to measure it. And just as in carpentry, you can waste a lot of time if you make changes without being very sure exactly what those changes should be. In this chapter, we will explore some of the best tools and methods for determining where to cut.
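The first and last guidelines can be tried together in a few lines. This sketch is illustrative only: bubble_sort is a textbook implementation written for the comparison, it is measured against Ruby's built-in Array#sort, the input size is arbitrary, and the timings vary from machine to machine, so none are shown.

```ruby
require 'benchmark'

# Textbook bubble sort, O(n^2) comparisons; returns a new sorted array.
def bubble_sort(values)
  sorted = values.dup
  (sorted.length - 1).downto(1) do |last|
    0.upto(last - 1) do |i|
      sorted[i], sorted[i + 1] = sorted[i + 1], sorted[i] if sorted[i] > sorted[i + 1]
    end
  end
  sorted
end

# Time both approaches on the same input and return the two durations.
def compare_sorts(size)
  input = Array.new(size) { rand(1_000_000) }
  slow = Benchmark.realtime { bubble_sort(input) }
  fast = Benchmark.realtime { input.sort }
  { :bubble_sort => slow, :builtin_sort => fast }
end

compare_sorts(2_000) # measure first, then decide where to cut
```

Doubling the input size roughly quadruples the bubble sort time, a gap that no amount of micro-tuning inside the inner loop will close.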
End of content preview. The remainder of this section cannot be viewed here.
Measurement Tools
Content preview
Of course, in order to properly measure performance, we need tools. This section is concerned with analysis of Ruby and Rails code, as well as web applications in general. There are a series of tools that can be used to analyze the full Rails stack, from HTTP down to Ruby's internals.
The most basic high-level measurement you will be interested in is: in the ideal case, how fast can this server serve requests? While the answer is a somewhat nebulous value that often bears no relation to actual performance under typical load, it is still useful to compare against itself—for example, when testing caching or deploying a new feature set.
This technique is called black-box analysis: we measure how much traffic the server can handle, while treating it as a "black box." For now, we don't really care what's inside the box, only about how fast it can serve requests. We will leave the minutiae of profiling and tweaking until later.
For this stage, we will need a benchmarking utility—but first, a brief diversion into the world of mathematics.

Statistics: The least you need to know

It doesn't take much knowledge of statistics to properly interpret the results of black-box analysis, but there are some things you need to know.
Statistical analysis deals with the results of multiple samples, which in this case correspond to HTTP response times. In Ruby fashion, we will illustrate this with a Ruby array:

	samples = %w(10 11 12 10 10).map{|x|x.to_f}

The average, or mean, of these samples is their sum divided by the number of samples. This is straightforward to translate into Ruby—adding a few methods to Enumerable:

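	module Enumerable
	  # Sum of all elements, starting from the given identity value.
	  def sum(identity = 0)
	    inject(identity) {|sum, x| sum + x}
	  end

	  # Arithmetic mean. Uses count rather than length, because not every
	  # Enumerable (a Range, for example) responds to length.
	  def mean
	    sum.to_f / count
	  end
	end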

This gives us predictable results:

	samples.sum    # => 53.0
	samples.length # => 5
	samples.mean   # => 10.6

Everyone is familiar with the mean, but the problem is that by itself, the mean is nearly worthless for describing a data set. Consider these two sets of samples:
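For instance, the following two sets share a mean but describe very different servers. The values, the millisecond unit, and the helper names are all invented here for illustration.

```ruby
# Two hypothetical sets of response times, in milliseconds.
STEADY = [10.0, 10.0, 10.0, 10.0, 10.0]
SPIKY  = [1.0, 1.0, 1.0, 1.0, 46.0]

def mean_of(values)
  values.inject(0.0) { |total, x| total + x } / values.length
end

# Range of the data: distance between the slowest and fastest sample.
def spread_of(values)
  values.max - values.min
end

mean_of(STEADY)   # => 10.0
mean_of(SPIKY)    # => 10.0
spread_of(STEADY) # => 0.0
spread_of(SPIKY)  # => 45.0
```

The mean alone cannot tell a server that always answers in 10 ms from one that usually answers instantly and occasionally stalls.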
End of content preview. The remainder of this section cannot be viewed here.
Rails Optimization Example
Content preview
To tie these concepts together, we will look at the process of benchmarking, profiling, and optimizing a Rails action. This example comes from a real application, one that is fairly large and complicated. We have seen pieces of this application before; it is a map-based real estate search application. The application deals heavily with geospatial data, and is based on the PostGIS spatial extensions to PostgreSQL.
We have identified an action to profile: the action that performs the search itself (POST /searches). This action is not particularly slow in absolute terms, but it is our most commonly used feature, and any performance we can gain reduces overall latency and makes our application feel snappier.
Once we have decided on an action whose performance we want to improve, we can profile it to see where our time is being spent. Jeremy Kemper recently added a new request profiler to Rails, which will be released with the final version of Rails 2.0. Its library is located at actionpack/lib/action_controller/request_profiler.rb, and it is accessible through script/performance/request. It is a fairly simple wrapper around the ruby-prof library, adding some commonly needed Rails functionality:
  • Rather than running a single action, the request profiler runs a specified integration test script, so the test procedure can be arbitrarily complex. This also means that we can profile non-GET requests, which is a must for the action we wish to profile.
  • The request profiler can run a script multiple times, while only profiling time actually spent in those actions (not the overhead of starting up the profiler).
  • The profiler script opens up both the flat and HTML graph profiles for us; we will see later how both of these are useful.
First, we need to install the ruby-prof gem:

	$ sudo gem install ruby-prof

We need to write an integration script that drives the profiler. As mentioned previously, this script can be arbitrarily complicated, but ours will be a single request. The script uses the same methods as integration scripts, except that the script's execution is wrapped in an integration runner (technically, the script's text is inserted into the runner using
End of content preview. The remainder of this section cannot be viewed here.
ActiveRecord Performance
Content preview
Object-relational mapping systems provide such a high-level environment for working with data that it is easy to forget about efficiency until it becomes a problem. Here are some common problems and solutions for ActiveRecord development.
When faced with a problem that doesn't map neatly to the given abstractions, most programmers have an instinct to jump down a level. When using ActiveRecord, this means using raw SQL.
For security as well as performance, it is important to understand the SQL that is being generated from the commands you issue. ActiveRecord provides a useful abstraction, but if you are not careful, it will bite you.
The simplest way to drop into raw SQL is to use ActiveRecord::Base.find_by_sql. This is a very flexible method that returns the same results as find(:all, …), but allows you to specify custom SQL. It will even sanitize an array for you:

	Person.find_by_sql ["SELECT * FROM people WHERE name LIKE ?", "#{name}%"]

The problem with find_by_sql is that it instantiates every object that is returned. This is usually fine, but sometimes it can be too much overhead. To avoid this, you may need to bypass ActiveRecord and talk directly with the connection adapter. This is fairly easy, but you can make it easier by bolting some convenience methods onto ActiveRecord to sanitize the query automatically:

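	class << ActiveRecord::Base
	  # Run the query through the connection adapter and return the first
	  # column of each row, without instantiating ActiveRecord objects.
	  def select_values(sql)
	    connection.select_values(sanitize_sql(sql))
	  end
	end

	sql = %(SELECT id FROM people WHERE last_name = ?)
	last_name = %(O'Reilly)

	Person.select_values [sql, last_name] # => ["12", "42"]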

Because ActiveRecord is not doing any of the work here, the values come across without any type conversion (as strings here). The complete list of methods available through the connection adapter (the ActiveRecord::Base.connection object) is listed in the RDoc for ActiveRecord::ConnectionAdapters::DatabaseStatements.
The so-called 1+N problem is characteristic of the problems that you can run into if you are not aware of your tools. It is best illustrated with an example.
End of content preview. The remainder of this section cannot be viewed here.
Architectural Scalability
Content preview
One of the hardest parts of building and deploying a web application is growing it. Luckily, Rails was designed with scalability in mind. The Rails scalability mantra is shared-nothing—the idea that each application server should stand on its own, not having to coordinate with other application servers to handle a request. The only thing that needs to be shared when scaling upward is the database.
Nevertheless, there are a few other concerns that you should be aware of when scaling a Rails application. The biggest concerns are the other shared state besides the application data: storage for sessions and cached data.
The Rails session infrastructure is built on top of Ruby's CGI::Session from the standard library. CGI::Session takes care of the basics of CGI session management. It provides the following session stores, each implemented as a class within CGI::Session:
FileStore
Stores data in a file as plain text. No attempt is made to marshal the data, so you must convert any session data into a String first.
MemoryStore
Stores session data natively in the memory of the Ruby interpreter process.
PStore
Similar to FileStore, but marshals the data before storing it. This allows you to store any type of data in the session. This is a good option for development, but it is not suitable for a production environment.
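The difference between the plain-text and marshaling stores comes down to what survives a round trip. The sketch below uses Ruby's Marshal module directly rather than CGI::Session, and the session contents and helper names are made up for the example.

```ruby
# Hypothetical session contents.
SESSION_DATA = { :user_id => 42, :cart => [101, 102] }

# A plain-text store can only hold a String, so structure must be flattened
# by hand before saving (here, only the user ID is kept).
def as_plain_text(data)
  data[:user_id].to_s
end

# A marshaling store serializes the whole object graph and restores it intact.
def marshal_round_trip(data)
  Marshal.load(Marshal.dump(data))
end

as_plain_text(SESSION_DATA)                      # => "42"
marshal_round_trip(SESSION_DATA) == SESSION_DATA # => true
```

Marshal.load should only ever be given data the server wrote itself, since loading untrusted marshaled data is unsafe.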
Because these options are quite thin and not too suited for large-scale web applications, Rails provides some session managers that are more helpful. In particular, all of these options enable sessions to be shared between different application servers. These implement the same interface as the other CGI::Session stores, so they are drop-in replacements. We will examine each one in detail here.
End of content preview. The remainder of this section cannot be viewed here.
Other Systems
Content preview
The remainder of this chapter is a collection of miscellaneous performance tips and solutions to common problems. If you have specific trouble, the Rails wiki (http://wiki.rubyonrails.com/) might help. The wiki is disorganized at times, but it has a large amount of relevant information on many topics if you are willing to search.
A large part of software development consists of selecting the right tools for the job. This encompasses not only languages but libraries, frameworks, source control, databases, servers, and all of the other tools and materials that go into a completed application.

Leveraging external programs

Sometimes the best way to solve a problem is not to have a problem at all. Chances are, if you have a moderately complicated technical problem, someone else has solved it. 37signals' Basecamp takes this approach when resizing images—rather than dealing with the hassle of installing RMagick, they just shell out to ImageMagick:

	# escape is assumed to be a shell-quoting helper defined elsewhere.
	def thumbnail(temp, target)
	  system(
	    "/usr/local/bin/convert #{escape(temp)} -resize 48x48! #{escape(target)}"
	  )
	end
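An alternative that needs no escaping helper is the multiparameter form of system. This is a sketch, not Basecamp's code: thumbnail_command and thumbnail_without_shell are hypothetical names, and the convert path and options are simply copied from the example above.

```ruby
# Build the argument list for ImageMagick's convert. Each element reaches the
# program as one argument, so spaces or quotes in file names need no escaping.
def thumbnail_command(temp, target)
  ['/usr/local/bin/convert', temp, '-resize', '48x48!', target]
end

# Run convert without going through a shell.
def thumbnail_without_shell(temp, target)
  system(*thumbnail_command(temp, target))
end

thumbnail_command('upload; rm -rf ~.jpg', 'thumb.jpg')
# => ["/usr/local/bin/convert", "upload; rm -rf ~.jpg", "-resize", "48x48!", "thumb.jpg"]
```

A file name that begins with a dash could still be read as an option by the program itself, so validating names remains worthwhile.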

Part of the beauty of scripting languages is that they were designed out of necessity, so they have ways to glue disparate parts together. In addition, most scripting languages have a rich set of community-developed libraries available. Though CPAN (Perl's collection of third-party libraries) is the undisputed champion in this arena, Ruby has RubyForge (http://rubyforge.org) and the Ruby Application Archive (http://raa.ruby-lang.org/).

Writing inline C code

Writing Ruby extensions in C used to be hard. If you wanted to rewrite performance-sensitive functions, there were many things besides the actual code that you had to deal with. Not so anymore.
Ryan Davis has unleashed an incredible tool, RubyInline, for integrating C with Ruby. This tool allows you to embed C/C++ code as strings directly within an application. The strings are then compiled into native code (only to be recompiled when they change) and installed into your classes. The canonical example, the factorial function, shows just how fast and clean this can be:
End of content preview. The remainder of this section cannot be viewed here.
Further Reading
Content preview
Zed Shaw's most famous rant, Programmers Need To Learn Statistics Or I Will Kill Them All (http://www.zedshaw.com/rants/programmer_stats.html), is an excellent (if a little aggressive) description of the most common misconceptions surrounding performance measurement.
Peepcode has a screencast on benchmarking with httperf at http://peepcode.com/products/benchmarking-with-httperf. It is $9 but is worth the cost for anyone involved in performance tuning.
Evan Weaver has a set of MySQL configuration files that are tuned for common Rails situations at http://blog.evanweaver.com/articles/2007/04/30/top-secret-tuned-mysql-configurations-for-rails. These are drop-in replacements for the standard my.cnf configuration file, and they are much more current than the examples provided with MySQL.
End of content preview. The remainder of this section cannot be viewed here.
Chapter 7: REST, Resources, and Web Services
Content preview
There are only two hard things in Computer Science: cache invalidation and naming things.
—Phil Karlton
The architectural principles of Representational State Transfer, or REST, have been taking the Rails world by storm. The idea behind REST has been around since Roy Fielding first described it in his 2000 doctoral dissertation. However, the ideas have only started to gain traction among Rails developers since David Heinemeier Hansson's presentation of those ideas in 2006 and the subsequent adoption of RESTful principles in Rails 1.2. RESTful design is a new way of thinking about network architecture based on an observation of how the Web works.
End of content preview. The remainder of this section cannot be viewed here.
What Is REST?
Content preview
In short, REST is a unifying theory for how "distributed hypermedia" systems (primarily, the World Wide Web) are best organized and structured. The term was coined by Roy Fielding, coauthor of the HTTP specification, in his 2000 doctoral dissertation Architectural Styles and the Design of Network-Based Software Architectures. The dissertation extracts a set of principles that are common to network architectures, based on an examination of the structure of the Web and the HTTP protocol. Starting with the "null style," which is the absence of constraints on architecture, Fielding arrives at REST by placing a series of constraints on network architecture:
Client-Server
The client-server constraint imposes a separation of data storage from user interface and presentation. The most important benefit of this separation is that client and server can exist in separate organizations and be maintained, developed, and scaled independently.
Stateless
The server may not hold persistent state about its sessions with the client. Each request from client to server is independent and self-contained. This increases verbosity, but aids reliability and scalability. When there is little or no stored context on a server, the system is more resilient to periodic failure, and there are fewer requirements for inter-server coordination when the system is scaled up.
Cache
This step requires the server to indicate whether or not the client may cache a response, and to define parameters for such caching. Providing explicit cache control information allows the client to cache more aggressively, reducing network traffic and increasing performance.
Uniform Interface
A uniform interface is the primary item that distinguishes REST from RPC and other network styles. Forcing the client and server to communicate using a well-known uniform interface pushes the application-specific complexity out of the network layer into the application layer, where it belongs. It allows standardized software components to be reused for vastly different applications, as they speak the same language.
End of content preview. The remainder of this section cannot be viewed here.
Benefits of a RESTful Architecture
Content preview
In this chapter, we have touched on some of the benefits that a RESTful application architecture can provide, and hopefully you have seen some of those benefits for yourself. Now we will list and explain each of the major benefits that REST strives to achieve.
The cornerstone of REST is simplicity. The decision to use a standard set of verbs (whether the HTTP verbs or some other set) virtually eliminates an entire area of discussion. The uniform registration and naming system of MIME types certainly doesn't settle the debate, but it definitely simplifies it.
With those two corners of the REST triangle taken care of, potentially the biggest gray area is in identifying and naming resources. Naming is one area where simplicity really pays off, because it is very easy to get it wrong. However, if you stick with a standard set of verbs and content types religiously, they will help constrain your choice of nouns.
It is very important to define clean, readable, persistent URIs for your resources. The REST FAQ (http://rest.blueoxen.net/cgi-bin/wiki.pl?RestFaq) makes a good observation about naming:
GET is restricted to a single-URL line and that sort of enforces a good design principle that everything interesting on the Web should be URL-addressable. If you want to make a system where interesting stuff is not URL-addressable, then you need to justify that decision.
This is what designers and architects mean when they say "constraints are freeing." The principles of REST were derived from examination of how the Web and other hypertext networks actually work. Rather than being some set of arbitrary restrictions, they embody the way that the Web should act.
By working within the principles of REST, any pain you may feel should be treated as a hint that you might be going against the grain of the Web's natural architecture. It is certainly possible that your particular case is a special one. Certain application domains just do not fit well into the REST paradigm. (REST has been described as "Turing-complete" in a parallel with programming languages. Though any application may be expressed in terms of REST, some may be much more conducive to REST than others.) But trying to push yourself into the REST paradigm forces you to defend any exceptions and special cases, and in doing so you may find that the exceptions were not necessary after all.
End of content preview. The remainder of this section cannot be viewed here.
RESTful Rails
Content preview
At RailsConf 2006, David Heinemeier Hansson's keynote marked the beginning of the RESTful philosophy becoming mainstream in Rails. The keynote, Discovering a World of Resources on Rails, presented a roadmap for moving Rails 1.2 toward a more RESTful, CRUD-based default.
One of the key points in the presentation was that resources can be more than things we might think of as "objects"; examples given were relationships between objects, events on objects, and object states. This is an important principle in REST. Rather than adding another method #close on a Case object, it may be clearer to factor out a Closure object if more information about the closure needs to be persisted.
From the outside of an application, the most visible change in RESTful Rails is the new routing. Classic Rails routing was based around the default route of /:controller/:action/:id, with any extra parameters usually being carried in the query string. This had advantages of simplicity and uniformity in routing, but it was brittle to change. Refactoring actions from one controller to another required updating all links pointing to that action; they had to be changed from:

	link_to 'Close', :controller => 'cases', :action => 'close', :id => @case.id

to:

	link_to 'Close', :controller => 'closures', :action => 'create',
	                 :id => @case.id

The next major innovation in Rails routing was the prevalence of named routes. By associating each URI with a name, you would get an easy way to refactor route URLs without changing the inward links. This provided another layer of abstraction on top of the actual URI parameters:
config/routes.rb

	map.close_case '/cases/close/:id', :controller => 'cases', :action => 'close'

_case.rhtml

	link_to 'Close', close_case_path(:id => @case.id)
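The indirection that named routes provide can be sketched in plain Ruby. This is a toy model, not Rails's routing code: NAMED_ROUTES and path_for are hypothetical names, and only :name-style segments are expanded.

```ruby
# Hypothetical route table: route name => URI pattern
NAMED_ROUTES = {
  :close_case => '/cases/close/:id'
}

# Expand a named route by substituting :param segments from the given hash.
def path_for(name, params = {})
  pattern = NAMED_ROUTES.fetch(name)
  pattern.gsub(/:(\w+)/) { params.fetch($1.to_sym).to_s }
end

path_for(:close_case, :id => 7) # => "/cases/close/7"
```

Views refer only to the route name, so changing the pattern in the table changes every generated link without touching the views.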

End of content preview. The remainder of this section cannot be viewed here.
Case Study: Amazon S3
Content preview
Amazon S3 (Simple Storage Service) is an online file-storage web service provided by Amazon. It is unique among online storage services in several ways:
  • It has a no-minimum pricing structure. Storage is billed by the GB-month, bandwidth is billed by the GB, and there is an additional charge per GET, PUT, and LIST request.
  • There is no web interface to create objects; the only full mode of access is through the API.
  • It is generally agreed that the S3 API is the first large public API that calls itself RESTful and actually lives up to the principles of REST.
  • In addition to the rich HTTP web service interface, S3 can serve objects over plain HTTP (without any custom HTTP headers) and BitTorrent. Many organizations use S3 as a storage network for their static content because it can serve images, CSS, and JavaScript just as well as a standard web server.
The full documentation for the S3 API is at http://aws.amazon.com/s3. We will now look into the basic architecture of S3, its concepts, and its set of operations.
S3 is used to store objects, which are streams of data with a key (a name) and attached metadata. They are like files in many ways. Objects are stored in buckets, which also have a key. Buckets are like filesystem directories, with a few differences:
  • Bucket names must be unique across the entire S3 system. You cannot pick a bucket name that has already been chosen by someone else.
  • Bucket names must be valid DNS names (alphanumeric plus underscore, period, and dash).
  • Buckets cannot be nested. There is one level of buckets, which contain objects. However, we can fake such nesting by giving objects keys like blog/2007/01/05/index.html. Slash characters, though they often designate hierarchy in URIs, are treated like any other character in object keys. We can even query keys by prefix, so we can ask to list keys starting with
Ende der Inhaltsvorschau. Der weiterere Inhalt dieses Abschnitts ist hier nicht einsehbar.
Further Reading
Roy Fielding's dissertation, Architectural Styles and the Design of Network-Based Software Architectures, is available online from http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm.
The REST wiki is full of theoretical as well as practical guidance about the principles of REST: http://rest.blueoxen.net/.
The HTTP/1.1 specification, RFC 2616, is fairly accessible for the working web developer. Every web application developer should at least be conversant in HTTP. An HTML version of the RFC is available from http://www.w3.org/Protocols/rfc2616/rfc2616.html.
Leonard Richardson and Sam Ruby's RESTful Web Services (O'Reilly) is a very accessible, yet comprehensive, introduction to the principles of RESTful design. Although it is oriented toward machine-consumable web services, the principles of REST are generally applicable to any network architecture.
Software architecture has a surprising amount in common with building architecture. For a different perspective on software architecture, Christopher Alexander's classic trilogy (The Timeless Way of Building, A Pattern Language, and The Oregon Experiment) is worth a read. The books describe how architecture influences and is influenced by life. Alexander's philosophies on architecture were the inspiration for the modern software design patterns movement.
End of preview. The remainder of this section is not available here.
Chapter 8: i18n and L10n
Wer fremde Sprachen nicht kennt, weiß nichts von seiner eigenen. (He who ignores foreign languages knows nothing of his own.)
—Goethe
As the reach of the Web expands, developers find that their web applications must be customized to match the needs of new audiences of different cultures. Internationalization is the process of adapting software so that it may be used across many cultures and locales. Localization is the process of actually modifying the product and creating a version customized for a particular language, country, or locale.
The difference between internationalization and localization can be fuzzy, and it can change from situation to situation. As a simplistic example, consider a social networking site. At a minimum, internationalization would involve adapting the application to accept and display data in a wide variety of character sets (say, by using UTF-8 for all input, output, and storage). Localization would at least involve translation of user interface elements to several languages, and possibly much more.
The term internationalization is usually abbreviated i18n, short for "i, 18 letters, and then n." Similarly, "localization" is abbreviated L10n. To avoid ambiguity, i18n is always written with a lowercase i, while L10n always uses an uppercase L. I will use this convention throughout this chapter.
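The abbreviation rule is simple arithmetic: keep the first and last letters and replace everything between them with a count. A small sketch, assuming a hypothetical helper named numeronym:

```ruby
# Abbreviate a word as first letter + number of interior letters + last letter.
# Words shorter than three letters have nothing to abbreviate.
def numeronym(word)
  return word if word.length < 3
  "#{word[0, 1]}#{word.length - 2}#{word[-1, 1]}"
end

puts numeronym('internationalization')
puts numeronym('Localization')
```

The helper preserves the case of the first letter, so the lowercase-i and uppercase-L convention is up to the caller.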
Locale
Although language translation gets the lion's share of attention in this field, it is but one part of i18n. A human language may have significant regional differences or variants between countries where the language is spoken. Dialects aside, there can be large differences in currency, collation (sort order), number and date format, and even writing system across regional or political divisions within a country.
These differences are encapsulated in the concept of locale. A locale is usually defined as a language plus a country or region. It includes not only language but also regional and local preferences and possibly a character encoding. A POSIX-style locale identifier looks like en_US.UTF-8 (English, United States, UTF-8 character encoding).
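To make the structure of such an identifier concrete, here is a minimal sketch that splits it into its parts. The pattern and the helper name parse_locale are assumptions made for illustration; the pattern covers only the language, territory, and encoding shape shown above, and real identifiers can carry more (such as a modifier suffix).

```ruby
# Matches identifiers shaped like "en", "en_US", or "en_US.UTF-8".
LOCALE_PATTERN = /\A([a-z]{2,3})(?:_([A-Z]{2}))?(?:\.([\w-]+))?\z/

# Split a POSIX-style locale identifier into a hash of its parts.
# Returns nil when the string does not fit the simplified pattern.
def parse_locale(identifier)
  match = LOCALE_PATTERN.match(identifier)
  return nil unless match
  { :language => match[1], :territory => match[2], :encoding => match[3] }
end

p parse_locale('en_US.UTF-8')
p parse_locale('de')
```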
End of preview. The remainder of this section is not available here.
Character Encodings
One of the most fundamental topics in i18n is the concept of a character encoding or character set. Computers work with numbers; people work with characters. A character encoding maps one to the other. This is simple enough. The difficulty comes, as it usually does, because of history.
At the time of this writing, ASCII is nearing its 45th birthday, yet we still see its legacy today. This should not surprise anyone; data is usually the most long-lived part of a computing system. Because networking protocols and storage formats are built on top of a character encoding, the encoding tends to be among the most deeply entrenched and hardest-to-change parts of a protocol stack.
ASCII, the American Standard Code for Information Interchange, was one of the first character encodings to gain widespread use; it was introduced in 1963 and first standardized in 1967. Most encodings in use today descend from ASCII.
The ASCII standard (ANSI X3.4-1986) defines 128 characters. The first 32 characters (with hex values 0 through 1F) and the last character (7F) are nonprinting control characters. The remainder (20 through 7E) are printable. The control characters have largely lost their original meaning, but the printable characters are nearly always the same. The standard ASCII table is as follows.

   x0  x1  x2  x3  x4  x5  x6  x7  x8  x9  xA  xB  xC  xD  xE  xF
0x NUL SOH STX ETX EOT ENQ ACK BEL BS  HT  LF  VT  FF  CR  SO  SI
1x DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM  SUB ESC FS  GS  RS  US
2x     !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3x 0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4x @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5x P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6x `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7x p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL
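
The split between control and printable characters follows directly from the ranges given above. A short sketch, assuming a hypothetical helper named ascii_class, counts both groups:

```ruby
# Classify a 7-bit ASCII code as :control or :printable, following the
# ranges in the text (00-1F and 7F are control; 20-7E are printable).
def ascii_class(code)
  raise ArgumentError, 'not a 7-bit ASCII code' unless (0..0x7F).include?(code)
  (code < 0x20 || code == 0x7F) ? :control : :printable
end

control, printable = (0..0x7F).partition { |code| ascii_class(code) == :control }
puts "control: #{control.size}, printable: #{printable.size}"
puts printable.map { |code| code.chr }.join
```

Note that the space character (hex 20) counts as printable under this definition, which is why the first cell of row 2x in the table appears empty.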

Extended ASCII

Although ASCII defines 128 characters and a 7-bit encoding, most computers process data in 8-bit bytes. This leaves room for 128 more characters. Of course, computer vendors each chose their own way to deal with this situation. This led to the development of numerous
End of preview. The remainder of this section is not available here.
Unicode
The extended-ASCII model was successful for many years, and the ISO-8859 encodings provided a good way to support different world scripts. However, the limitations became increasingly bothersome; multiple languages could not be supported within one document, and the CJKV languages had their own independently developed character sets and encodings. In addition, the Internet began to develop in the 1990s, connecting people and allowing them to exchange digital information with a far greater reach than before.
So, in 1991, the Unicode Consortium published the first Unicode standard. Unicode sought to be the "one true character set" in which all text would eventually be represented. In large part, that goal is well on the way to being accomplished. Unicode is a widely known, well-supported standard that is used extensively on the Internet and in other forms of data exchange today.
Unicode supports all of the world's writing systems currently in use and many archaic ones, with very few exceptions. There is no "code page" switching as there was under the old character-set systems. All of the scripts can be used interchangeably within a document, and the encodings are universal; they can be exchanged over the Internet without worrying too much about differing encodings.
Unicode deals with the world in Platonic ideals. Rather than representing glyphs (the rendering of a character), each Unicode code point represents a grapheme (the character abstracted from its representation). This is consistent with the purpose of a character encoding: to encode text without specifying presentation. For example, the following two characters are the same grapheme and would be represented by the same Unicode code point (U+0061, LATIN SMALL LETTER A), even though they are different glyphs (see the figure below).
Figure: Alternative glyphs representing the "a" grapheme
Though the distinction between graphemes and glyphs is relatively easy to make for English, it can be very difficult and occasionally political for Han characters (the ideographs common to CJKV languages).
End of preview. The remainder of this section is not available here.
Rails and Unicode
Ruby 1.8 has less-than-ideal Unicode support compared to contemporaries such as Java and the .NET languages. To Ruby, strings are just sequences of 8-bit bytes, while the character and string types of the Java runtime and the .NET CLR are based on Unicode code points. While Ruby's approach simplifies the language, most developers at this point in time need Unicode support. Luckily, Ruby is flexible enough that we can tack support for Unicode onto the language in a relatively friendly way.
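The gap between bytes and code points is easy to see with String#unpack, which can decode a string either byte by byte or as UTF-8. The following sketch spells the sample word as raw bytes so that it does not depend on the encoding of the source file; the variable names are illustrative only.

```ruby
# "résumé" written as raw UTF-8 bytes (each é occupies two bytes).
word = "r\xC3\xA9sum\xC3\xA9"

byte_count = word.unpack('C*').length   # one entry per 8-bit byte
code_points = word.unpack('U*')         # one entry per Unicode code point

puts "bytes: #{byte_count}"
puts "code points: #{code_points.length}"
puts code_points.map { |cp| format('U+%04X', cp) }.join(' ')
```

Under Ruby 1.8, String#length reports the byte count for this string, which is the root of most of the Unicode trouble discussed here.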
It is not surprising that Ruby's Unicode support is lacking. During the time of Ruby's genesis in Japan (the mid-1990s), Unicode was first being developed. In Unicode's early stages, its supporters were mainly American and European, with less East Asian involvement.
Many Japanese people opposed the process of Han unification, or collapsing most of the Han characters common to CJKV languages into a single set of code points. The unified Han characters tended to appeal more to Chinese speakers than Japanese speakers. The people involved in Han unification (primarily Westerners) tended to collapse characters that were similar, but not identical, across Asian languages. In the early days of Unicode, rendering software would get confused and display similar, but incorrect, glyphs for the Han-unified characters. This was at best disconcerting; at worst, offensive.
There are technical solutions to all of these problems today, but Unicode was a slow starter in Japan. Other character sets such as Shift_JIS gained more currency in Japan at the time, which actually may have contributed somewhat to the problem; having more extant character sets leads to more conversion issues.
Ruby 1.9 will support multilingualization (m17n). Rather than a built-in Unicode assumption, Ruby 1.9 will support interoperability between multiple character sets. This is more flexible than assuming that all string literals are Unicode, and it is a more general approach to character set handling. To use UTF-8 for all string and regex literals, the following pragma can be used:
End of preview. The remainder of this section is not available here.
Rails L10n
For an application to be truly ready for worldwide visitors, internationalization is just the beginning. It is vital for an application with global reach to correctly accept, process, and store UTF-8 data. But it is also important, when supporting users from different regions and locales, to localize the interface and any applicable data to the users' locales. This can involve any of several things, which we will cover in this section.
As the term is most often used, "localization" refers to translating interface text and resources into users' languages. The traditional software package used for localizing interface text is GNU gettext.

gettext

gettext uses literal strings from the program's source as keys; translators write files that provide translations for each of the strings. There are several steps to using gettext in an application. We will use Ruby-Gettext, which is a mostly compatible Ruby version of GNU gettext.
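The idea of using source strings as lookup keys can be sketched in a few lines of plain Ruby. This is a toy model, not how Ruby-Gettext is implemented: the CATALOGS hash, the translate helper, and the Spanish entry are all assumptions made for illustration. The one behavior borrowed from gettext is the fallback, where an untranslated string is returned unchanged.

```ruby
# A toy message catalog: source strings are the keys, translations the values.
CATALOGS = {
  'es' => { 'Hello, world!' => 'Hola, mundo!' }
}.freeze

# Look up msgid in the catalog for the given locale; fall back to the
# source string itself when no translation exists.
def translate(msgid, locale)
  CATALOGS.fetch(locale, {}).fetch(msgid, msgid)
end

puts translate('Hello, world!', 'es')
puts translate('Hello, world!', 'en')
puts translate('Goodbye!', 'es')
```

The fallback is what lets a program run usefully before any translation work has been done.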
First, we install the gettext gem:

	$ sudo gem install gettext
	Successfully installed gettext-1.10.0

Next, we create a very basic skeleton application that loads the gettext gem, binds to the text domain (application name) hello, and displays a greeting:
hello.rb

	#!/usr/local/bin/ruby -w

	require 'rubygems'
	require 'gettext'

	include GetText
	bindtextdomain('hello')

	puts _("Hello, world!")

The _() function is gettext's standard method for localization. All literal text that is to be localized should be wrapped in a call to this method. Our locale is set to U.S. English, so upon running the program, we see the default U.S. English version without having to do any localization:

	$ echo $LC_CTYPE
	en_US.UTF-8
	$ ./hello.rb
	Hello, world!

The developer now creates a .pot file from the source. This extracts all text to be translated from the program and puts it in a template, which the translator will work from. The GNU gettext program to create
End of preview. The remainder of this section is not available here.
Further Reading
Sven Fuchs has a great write-up about Globalize on his blog (http://www.artweb-design.de/2006/11/10/get-on-rails-with-globalize-comprehensive-writeup). The Globalize site (http://www.globalize-rails.org) has plenty of information on setting up Globalize.
There is a mailing list for developers involved with internationalization in Rails. The list information page is available at http://rubyforge.org/mailman/listinfo/railsi18n-discussion.
The Ruby on Rails wiki has a page with good coverage of the current i18n options at http://wiki.rubyonrails.com/rails/pages/InternationalizationComparison.
Figure: Spanish translation: viewing a newly created person
Figure: Spanish translation: viewing all people; one created
End of preview. The remainder of this section is not available here.
Chapter 9: Incorporating and Extending Rails
The best way to predict the future is to invent it.
—Alan Kay
Ruby on Rails was designed as a loosely coupled set of components (ActionPack, ActiveRecord, ActiveResource, ActiveSupport, and ActionMailer) with some glue to hold them together (RailTies). Although Rails is typically used as a framework (an environment specialized to programming web applications), the components of Rails can be replaced with other components more suitable to a project. Alternatively, the components can be broken out and used apart from the rest of Rails. In this chapter, we will see how these techniques can be used for maximum flexibility in application development.
Replacing Rails Components
ActiveRecord, the Rails object-relational mapper, is one of the best-known parts of the Rails framework. But it represents one of many valid ways to map objects to a database. Martin Fowler identified and defined the Active Record pattern, along with other data-source patterns, in his book Patterns of Enterprise Application Architecture. (The Active Record pattern should not be confused with the ActiveRecord library, which is based on that pattern.) Several Ruby libraries have been developed based on other patterns. We will look at DataMapper, based on the pattern of the same name. We will also examine Ambition, an off-the-wall experimental library that maps Ruby statements directly to SQL.
If you are not using ActiveRecord in a Rails application, you can disable it by removing it from config.frameworks in config/environment.rb:

	config.frameworks -= [ :active_record ]

DataMapper

The DataMapper library (http://www.datamapper.org/) is based on Martin Fowler's Data Mapper pattern, which is similar to Active Record but with less coupling. Active Record's chief structural weakness is that it ties the database schema to the object model. We see this happen in Rails when using ActiveRecord. Every structural change we want to make to the objects must be reflected in the database at the same time.
The Data Mapper pattern provides a better balance when the object model and database need to evolve separately. The drawback is that there will be some duplication. Because of the additional layer of indirection, DataMapper cannot infer your object's structure from the database like ActiveRecord can. This is the necessary price of flexibility.
DataMapper confers several other advantages over ActiveRecord:
  • DataMapper includes an implementation of the Identity Map pattern, which ensures that each database record is loaded only once. ActiveRecord will happily allow a record to be loaded many times, which can potentially cause conflicts when one becomes stale.
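The Identity Map pattern itself is small enough to sketch in plain Ruby. The following is not DataMapper's implementation; the Person struct, the ROWS hash standing in for a table, and the PersonRepository class are all hypothetical names used only to show the idea of caching loaded objects by primary key.

```ruby
# A plain Ruby object standing in for a database-backed record.
Person = Struct.new(:id, :name)

# Hypothetical stand-in for a table: primary key => row attributes.
ROWS = { 1 => { :name => 'Alice' }, 2 => { :name => 'Bob' } }.freeze

class PersonRepository
  attr_reader :reads

  def initialize(rows)
    @rows = rows
    @identity_map = {}   # primary key => already-loaded object
    @reads = 0           # counts simulated database reads
  end

  # Return the object for this id, reading the row at most once.
  def find(id)
    @identity_map[id] ||= read_row(id)
  end

  private

  def read_row(id)
    row = @rows.fetch(id)
    @reads += 1
    Person.new(id, row[:name])
  end
end

repo = PersonRepository.new(ROWS)
first = repo.find(1)
second = repo.find(1)
puts first.equal?(second)
puts repo.reads
```

Because both lookups return the very same object, a change made through one reference is visible through the other, which is what prevents the stale-copy conflicts mentioned above.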
End of preview. The remainder of this section is not available here.
Incorporating Rails Components
As Rails is built up of many modular components, these components can be used individually just as they can be used as a framework. Here we will see how the pieces that make up Rails can be used in other Ruby code. We will walk through two modular components of Rails, ActiveRecord and ActionMailer, and see how to use them in standalone applications.
ActiveRecord is perhaps the easiest component to decouple from the rest of Rails, as it fulfills a purpose (object-relational mapping) that can be used in many different places. The basic procedure for loading ActiveRecord is simple; just define the connection, and then create the classes that inherit from ActiveRecord::Base:

	require 'rubygems'
	require 'active_record'

	ActiveRecord::Base.establish_connection(
	  # connection hash
	)

	class Something < ActiveRecord::Base # DB table: somethings
	end

The establish_connection function takes a hash of parameters needed to set up the connection. This hash is the same one that is loaded from database.yml when using Rails, so you could just pick up that file and load it:

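	require 'yaml' # Ruby standard library
	# A Rails database.yml is keyed by environment; pick one (here 'development').
	config = YAML.load_file('database.yml')['development']
	ActiveRecord::Base.establish_connection(config)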

If you are used to the features of edge Rails, you may not want to stick with the latest gem version of ActiveRecord. To use the latest edge, first check out ActiveRecord's trunk from Subversion:

	$ svn co http://svn.rubyonrails.org/rails/trunk/activerecord \
	         vendor/activerecord

Then, just require the active_record.rb file from that directory:

	require 'vendor/activerecord/lib/active_record'

ETL operations

ActiveRecord can be a useful tool to load data into and extract data from databases. It can be used for anything from one-off migration scripts to hourly data transformation jobs. The following is a representative example, using James Edward Gray II's FasterCSV library:

	require 'rubygems'
	require 'fastercsv' # gem install fastercsv
	require 'active_record'

	# Set up AR connection and define User class
	ActiveRecord::Base.establish_connection(
	  # (connection spec)...
	)

	# Inherit from ActiveRecord::Base so that set_table_name,
	# set_primary_key, and create! are available.
	class User < ActiveRecord::Base
	  # The table we're importing into doesn't use Rails conventions,
	  # so we'll override some defaults.
	  set_table_name 'user'
	  set_primary_key 'userid'
	end

	FasterCSV.foreach('users.csv', :headers => true) do |row|
	  # The CSV header fields correspond to the database column names,
	  # so we can do this directly, with no mapping.
	  User.create! row.to_hash
	end
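
To see what row.to_hash hands to create!, the row-to-hash step can be tried on its own without a database. This sketch uses the csv library distributed with Ruby 1.9 and later, which grew out of FasterCSV, and parses a hypothetical in-memory sample rather than a users.csv file.

```ruby
require 'csv' # distributed with Ruby 1.9 and later

# Hypothetical file contents; the header names double as column names.
CSV_DATA = <<CSV_TEXT
userid,name,email
1,Alice,alice@example.com
2,Bob,bob@example.com
CSV_TEXT

# With :headers => true, each row converts to a hash of header => value.
rows = CSV.parse(CSV_DATA, :headers => true).map { |row| row.to_hash }
rows.each { |attributes| p attributes }
```

All values arrive as strings; in the import script above, ActiveRecord takes care of casting them to the column types.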

End of preview. The remainder of this section is not available here.
Contributing to Rails
Rails, as an open source framework, benefits greatly from contributions from the community. Rails incorporates code from hundreds of developers, not just the dozen or so on the core team. Writing code to expand, extend, or fix Rails is often the best way to learn about its internals.
Of course, not all functionality belongs in Rails itself. Rails is an opinionated framework, so there are some defaults that may not be useful to everyone. The plugin system was designed so that Rails would not have to incorporate every feature that is useful to someone. Refer to the chapter on plugins for information on writing plugins to extend Rails; it is only minimally more work than patching the Rails codebase.
There are several reasons that useful features are rejected from Rails in favor of being plugins. The primary reason for rejection is that the feature is too specific; it would not be useful to most Rails developers. Alternatively, it may be contrary to the "opinion" of the framework. However, features may be rejected simply because there are many valid ways of accomplishing one goal, and it does not make sense to default to one. Some common areas of functionality that have been repeatedly discussed and rejected from Rails core are the following:
Engines
David Heinemeier Hansson's rejection of high-level components in Rails is a topic that has generated much more heat than light. Rails engines (http://rails-engines.org) are full-stack (model, view, and controller) components that can be incorporated into larger applications; in effect, they augment the plugin system to structure the sharing of model, view, and controller code.
The trouble with engines comes when they are treated as high-level components, as if dropping a content-management-system engine into an application will accomplish 90% of the CMS functionality a particular project needs. In many cases, the work required to integrate such a high-level component into an existing application outweighs the benefits of not writing the component from scratch.
End of preview. The remainder of this section is not available here.
Further Reading
Railscasts has produced a screencast detailing the process of contributing to Rails. It is available at http://railscasts.com/episodes/50.
Notes from Josh Susser's talk on contributing to Rails are posted at http://edgibbs.com/2007/04/23/josh-susser-on-contributing-to-rails/.
End of preview. The remainder of this section is not available here.
Chapter 10: Large Projects
Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it.
—Alan Perlis
This chapter introduces several concepts that are related to deploying large applications in general, and Rails applications in particular. These are valuable concepts for any project, regardless of the framework being used.
Version Control
For all but the tiniest of projects, version control is non-negotiable. Version control is like a time machine for a project; it aids in collaboration, troubleshooting, release management, and even systems administration. Even for a solo developer working on a small project on one workstation, the ability to go back in time across a codebase is one of the most valuable things to have.
There are two primary models for version control systems: centralized and decentralized. Though the former is the most widely known, the latter is steadily gaining in popularity and has some amazing capabilities.
Centralized version control is the most popular model, and perhaps the easiest to understand. In this model, there is a central repository, operated by the project administrators. This repository keeps a virtual filesystem and a history of the changes made to that filesystem over time.
The following figure illustrates the typical working model used for centralized development.
Figure: Centralized version control
A developer follows this basic procedure to work with a version control system:
  1. Create a working copy (a local copy of the code for development) by performing a checkout. This downloads the latest revision of the code.
  2. Work on the code locally. Periodically issue update commands, which will retrieve any changes that have been made to the repository since the last checkout or update. These changes can usually be merged automatically, but sometimes manual intervention is required.
  3. When the unit of work is complete, perform a commit to send the changes to the repository. Repeat from step 2, as you already have a working copy.

CVS

The Concurrent Versions System (CVS, http://www.nongnu.org/cvs/) is the oldest version control system still in common use. Although Subversion is generally favored as its replacement, CVS pioneered several features considered essential to centralized version control systems, such as the following:
End of preview. The remainder of this section is not available here.
Issue Tracking
Issue-tracking systems are essential to any large or long-lived project. The term "issue" is broad enough to encompass things that may not be thought of as bugs or defects: feature requests, work orders, support requests, or even planning documents for future changes to an application.
The difference between products called "issue trackers" and those called "bug trackers" is largely one of focus; the two typically implement similar sets of features. Issue trackers tend to be customer-oriented; even if only used by employees, each ticket represents a customer problem. Bug trackers tend to be focused more on the product; they collect bugs, feature requests, or other issues regarding the project. One distinguishing factor is that under a bug tracker, multiple tickets representing the same issue will usually be folded into one ticket, even if the tickets affect different customers.
One powerful feature that some issue trackers offer is integration with a version control system. This allows the history of each issue to be correlated with the development of the code. Patches intended to fix an issue can reference the issue number directly. Conversely, issues can reference version control changesets (for example, "fixed in r1843"), and the issue-tracking system provides the changesets in a friendly format (such as HTML diff).
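As a rough illustration of how a tracker might recognize changeset references in issue text, the following sketch scans for Subversion-style revision numbers. The helper name changeset_references and the sample comment are hypothetical, and the r-plus-digits convention is taken only from the example above; a real tracker would define its own syntax.

```ruby
# Pull revision references such as "r1843" out of free text and return
# the revision numbers as integers, without duplicates.
def changeset_references(text)
  text.scan(/\br(\d+)\b/).flatten.map { |digits| digits.to_i }.uniq
end

comment = 'Regression introduced in r1790; fixed in r1843 (see also r1843).'
p changeset_references(comment)
```

A tracker could then turn each number into a link to the corresponding changeset view.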
Some of the most popular issue-tracking systems are listed here.
Product  | Platform            | Description
Bugzilla | Perl, Apache or IIS | Mozilla project's bug tracker. Oriented toward open source software development. Flexible workflow.
Collaboa | Rails               | Newcomer to the industry; development trunk is still fairly unstable. Takes the best features from Trac and cleans them up a bit. Looks very promising.
End of preview. The remainder of this section is not available here.
Project Structure
There are several decisions that must be made about how to structure a large Rails application. Issues arise with how to manage multiple branches of development, a team of developers, and external or vendor software. In this section, we cover some of the most common choices.
Subversion usually needs a little bit of configuration to work with Rails. There are some "volatile" files that change from development to production or within a deployment. These files should be kept out of version control. In Subversion, a file is ignored within a directory by setting a pattern matching the file as the value of the svn:ignore property on the parent directory. For most Rails applications, the following ignores are typically used:

	$ svn propset svn:ignore database.yml config/
	$ svn propset svn:ignore "*" log/ tmp/{cache,pids,sessions,sockets}

There is a Subversion client configuration that sets up many of these settings, and will ignore those volatile files without the need for svn:ignore. It also sets up autoprops, which sets the MIME type on files in the repository automatically. If you work mainly with Rails projects, this can be a good choice. The config file is available from http://3spoken.wordpress.com/rails-subversion-tng-config-file.
As a rule, configuration specific to a particular Rails environment (excluding database connection specifications, which are more specific to the developer and his environment) should not be ignored, but rather should be placed in environment-specific blocks. This allows the configuration to be versioned while still remaining environment-specific.

Importing existing applications

The svn import command is designed to place a directory (and its subdirectories) under version control. Unfortunately, it is only an import, not a checkout. It does not turn the imported directory into a working copy, which is usually the behavior you want when importing a project that is already under development.
There is a neat Subversion trick to add an existing directory tree "in place" to an empty repository. You can use this when putting an existing application under version control:
End of preview. The remainder of this section is not available here.
Rails Deployment
Content preview
As a full-stack web framework, Rails can require some work to deploy an application from the ground up. Rails, unfortunately, has a bad reputation for being hard to deploy, mainly due to problems with the preferred deployment environments when Rails was young (2004–2005). But Rails has grown up, and Mongrel came along in 2006 and made things much easier. There are now good sets of best practices for deploying Rails applications, from the smallest development environments to huge multi-data-center worldwide clusters.
One of the most basic concerns when deploying any web application is scalability: how well the underlying architecture can respond to increased traffic. The canonical Rails answer to the scalability question is shared-nothing (which really means shared-database): design the system so that nearly any bottleneck can be removed by adding hardware. The standard architecture is shown in the following figure.
Figure: Simple shared-nothing deployment environment
The interface to the application is either a light web server (operating as a reverse proxy balancer) or a hardware load balancer. A small web server is usually used to handle the static files (images, JavaScript, static HTML, stylesheets, and the like) because a single-purpose static file server is much faster than an application server at serving static files. This front-end box delegates dynamic requests to one of the application servers, selected either randomly or based on load.
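
The routing decision described above can be sketched in a few lines of Ruby. This is an illustrative model only, not how any particular web server or load balancer is implemented; the names AppServer, FrontEnd, and route, the list of static extensions, and the use of an active-request count as the measure of load are all assumptions.

```ruby
# Hypothetical model of an application server and its current load.
AppServer = Struct.new(:name, :active_requests)

# Hypothetical front end: serves static files itself and hands dynamic
# requests to one application server.
class FrontEnd
  STATIC_EXTENSIONS = %w[.html .css .js .png .jpg .gif].freeze

  def initialize(app_servers, strategy: :least_loaded, rng: Random.new)
    raise ArgumentError, "need at least one application server" if app_servers.empty?
    @app_servers = app_servers
    @strategy = strategy
    @rng = rng
  end

  # Returns :static for static files, otherwise the chosen AppServer.
  def route(path)
    return :static if static?(path)
    pick_server
  end

  private

  def static?(path)
    STATIC_EXTENSIONS.include?(File.extname(path).downcase)
  end

  def pick_server
    case @strategy
    when :random       then @app_servers[@rng.rand(@app_servers.size)]
    when :least_loaded then @app_servers.min_by(&:active_requests)
    else raise ArgumentError, "unknown strategy #{@strategy.inspect}"
    end
  end
end

servers = [AppServer.new("app1", 3), AppServer.new("app2", 1)]
front_end = FrontEnd.new(servers)
puts front_end.route("/images/logo.png").inspect
puts front_end.route("/articles/42").name
```

Application servers never appear in the static path, which is the point of putting a small web server in front: adding application servers raises dynamic capacity without touching how static files are served.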
For redundancy in large setups, two front-end servers can be used, on separate machines, proxying to the same set of application servers.
If high availability is required, the load balancers must use a VIP-/VRRP-based solution to ensure that the cluster will always respond to all of its IP addresses even under the failure of one load balancer. If high availability is not a requirement, primitive load balancing will suffice, by giving each load balancer its own IP address and exposing them all through a DNS RR (round-robin) record.
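
The primitive DNS round-robin approach can be modeled as a record that rotates its list of addresses on every lookup. The sketch below is a simplified illustration: real DNS servers and client resolvers vary in how they order and cache answers, the class name RoundRobinRecord is hypothetical, and the addresses come from the 192.0.2.0/24 documentation range.

```ruby
# Hypothetical model of a DNS round-robin record: each lookup returns
# every address, starting one position later than the previous lookup.
class RoundRobinRecord
  def initialize(addresses)
    raise ArgumentError, "need at least one address" if addresses.empty?
    @addresses = addresses.dup.freeze
    @offset = 0
  end

  # Returns all addresses in rotated order and advances the rotation.
  def resolve
    answer = @addresses.rotate(@offset)
    @offset = (@offset + 1) % @addresses.size
    answer
  end
end

# One address per load balancer (documentation-range addresses).
balancers = RoundRobinRecord.new(%w[192.0.2.10 192.0.2.11])
3.times { puts balancers.resolve.first }
```

Clients that take the first address are spread across the balancers, but the record keeps handing out the address of a balancer that has failed. That is why this scheme balances load without providing high availability.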
End of the content preview. The rest of this section is not available here.
Further Reading
Content preview
There are many resources, both free and paid, for learning the version control systems mentioned in this chapter. For Subversion, there is Version Control with Subversion (http://svnbook.red-bean.com), which is available both for free online and as a print book from O'Reilly. Also available is Pragmatic Version Control Using Subversion (http://pragmaticprogrammer.com/titles/svn2/index.html), which is more of a tutorial than a reference.
CVS has similar options available. The book Open Source Development with CVS is available under a GPL license online at http://cvsbook.red-bean.com. The print book is also distributed by O'Reilly. The Pragmatic Programmers' offering is Pragmatic Version Control Using CVS (information available at http://pragmaticprogrammer.com/starter_kit/vcc/index.html).
Similarly, the best book about Mercurial is free. It can be downloaded from http://hgbook.red-bean.com.
Matt Pelletier and Zed Shaw have written a book on Ruby application deployment with Mongrel; it can be purchased and downloaded as a PDF from http://www.awprofessional.com/bookstore/product.asp?isbn=0321483502&rl=1.
Ezra Zygmuntowicz is writing the book on Rails deployment. Information is available at http://www.pragmaticprogrammer.com/titles/fr_deploy/index.html.
End of the content preview. The rest of this section is not available here.
	
