Abstract
Ruby is a pure, untyped, object-oriented language with a full metaclass model, iterators, closures, and reflection. The freely available language is easy to learn thanks to a simple syntax and transparent semantics.
Full text
A freely available pure object-oriented language
Take the pure object orientation of Smalltalk, but remove the quirky syntax and reliance on a workspace. Add in the convenience and power of Perl, but without all the special cases and magic conversions. Wrap it up in a clean syntax based in part on Eiffel, and add a few concepts from Scheme, CLU, Sather, and Common Lisp. You end up with Ruby.
Thanks in part to the energy of its creator, Yukihiro Matsumoto (Matz), Ruby is already more popular than Python in its native Japan. Ruby is a pure, untyped, object-oriented language: just about everything in Ruby is an object, and object references are not typed. People who enjoy exploring different OO programming paradigms will enjoy experimenting with Ruby: It has a full metaclass model, iterators, closures, and reflection, and it supports the run-time extension of both classes and individual objects.
The freely available Ruby (http://www.ruby-lang.org/) is being used worldwide for text processing, XML and web applications, GUI building, in middle-tier servers, and general system administration. Ruby is used in artificial intelligence and machine-learning research, and as an engine for exploratory mathematics.
Ruby's simple syntax and transparent semantics make it easy to learn. Its direct execution model and dynamic typing let you develop code incrementally: You can typically add a feature and then try it immediately, with no need for scaffolding code. Ruby programs are typically more concise than their Perl, Python, or C++ counterparts, and their simplicity makes them easier to understand and maintain. When you bump up against some facility that Ruby is lacking, you'll find it easy to write Ruby extensions using both Ruby and low-level C code that adds new features to the language.
We came across Ruby when we were looking for a language to use as a prototyping and specification tool. We've used it on all of our projects since. We have Ruby code performing distributed logging, executing within an X Windows window manager, precompiling the text of a book, and generating indexes. Ruby has become our language of choice.
Everything's an Object
Everything you manipulate in Ruby is an object, and all methods are invoked in the context of an object. (In our examples here, we'll sometimes show the result of evaluating an expression to the right of an arrow (->). This is not part of the Ruby syntax.)
"gin joint".length -> 9
"Rick".index("c") -> 2
-1942.abs -> 1942
sam.play(aSong) -> "duh dum, da dum de dum..."
In Ruby and Smalltalk jargon, all method calls are actually messages sent to an object. Here, the thing before the period is called the "receiver," and the name after the period is the method to be invoked.
The first example asks a string for its length, and the second asks a different string to find the index of the letter "c." The third line has a number calculate its absolute value. Finally, we ask the object "sam" to play us a song.
It's worth noting a major difference between Ruby and most other languages. In Java, for example, you'd find the absolute value of some number by calling a separate function and passing in that number. In Ruby, the ability to determine absolute values is built into numbers; they take care of the details internally. You simply send the message abs to a number object and let it do the work.
number = Math.abs(number) // Java
number = number.abs // Ruby
The same applies to all Ruby objects: In C, you'd write strlen(name); while in Ruby, it's name.length. This is part of what we mean when we say that Ruby is a genuine OO language.
The parentheses on method calls are optional unless the result would be ambiguous. This is a big win for parameterless methods, as it cuts down on the clutter generated by all those () pairs.
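The receiver-and-message idea can be tried directly in an interpreter. The following is a small sketch using only core Ruby; the variable names are illustrative and not taken from the article.

```ruby
# Every value is an object, so every call has a receiver.
length = "gin joint".length     # the string literal is the receiver
index  = "Rick".index("c")      # parentheses kept because there is an argument
abs    = -1942.abs              # integer literals respond to messages too

# A method call is a message send; Object#send makes that explicit.
same_abs = -1942.send(:abs)

# Parentheses are optional on parameterless calls.
shout = "rick".upcase

puts length, index, abs, same_abs, shout
```

Writing the call as send(:abs) is rarely needed in practice; it appears here only to show that abs is a message handled by the number itself.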
Classes and Methods
As Example 1 shows, Ruby class definitions are remarkably simple: The keyword class is followed by a class name, the class body, and the keyword end to finish it all off. Ruby features single inheritance: Every class has exactly one superclass, which can be specified as in Example 2. A class with no explicit parent is made a child of class Object, the root of the class hierarchy and the only class with no superclass. If you're worried that a single inheritance model just isn't enough, never fear. We'll be talking about Ruby's mix-in capabilities shortly.
Returning to the definition of class Song in Example 1, the class contains two method definitions, initialize and to_s. The initialize method participates in object construction. To create a Ruby object, you send the message new to the object's class, as in the last line of Example 1. This new message allocates an empty, uninitialized object, and then sends the message initialize to that object, passing along any parameters that were originally given to new. This makes initialize roughly equivalent to constructors in C++ and Java.
Class Song also contains the definition of the method to_s. This is a convenience method; Ruby sends to_s to an object whenever it needs to represent that object as a string. By overriding the default implementation of to_s (which is in class Object), you get to control how your objects are printed (for example, by tracing statements and the debugger), and how they appear when they are interpolated in strings.
In Example 2, we create a subclass of class Song, overriding both the initialize and to_s methods. In both of the new methods we use the super keyword to invoke the equivalent method in our parent class. In Ruby, super is not a reference to a parent class; instead, it is an executable statement that reinvokes the current method, skipping any definition in the class of the current object.
By default, all methods (apart from initialize) are publicly accessible; they can be invoked by anyone. Ruby also supports private and protected access modifiers, which can be used to restrict the visibility of methods to a particular object or a particular class, respectively. Ruby's implementation of "private" is interesting: You cannot invoke a private method with an explicit receiver, so it may only be called with a receiver of self, the current object.
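A minimal sketch of the kind of class and subclass described above may help. It is not the article's own Example 1 and Example 2; the subclass name KaraokeSong, the lyrics parameter, and the private helper banner are assumptions made for illustration.

```ruby
class Song
  def initialize(title)
    @title = title
  end

  # Ruby sends to_s whenever it needs a string form of the object.
  def to_s
    "Song: #{@title}"
  end
end

# Hypothetical subclass; "< Song" names the single superclass.
class KaraokeSong < Song
  def initialize(title, lyrics)
    super(title)            # reinvokes initialize one level up
    @lyrics = lyrics
  end

  def to_s
    super + " [#{@lyrics}]" # reuse the parent's string form
  end

  # Public method that calls a private helper with an implicit receiver.
  def announce
    banner
  end

  private

  def banner
    "Now playing: #{self}"
  end
end

song = KaraokeSong.new("My Way", "And now, the...")
puts song            # puts calls to_s
puts song.announce

begin
  song.banner        # explicit receiver on a private method
rescue NoMethodError => e
  puts "rejected: #{e.class}"
end
```

The last few lines show the access rule in action: banner works when called from inside the object, and raises NoMethodError when called on song from outside.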
Attributes, Instance Variables, and Bertrand Meyer
The initialize method in class Song contains the line @title = title. Names that start with single "at" signs (@) are instance variables: variables that are specific to a particular instance or object of a class. In our case, each Song object has its own title, so it makes sense to have that title be an instance variable. Unlike languages such as Java and C++, you don't have to declare your instance variables in Ruby; they spring into existence the first time you reference them. Another difference between Ruby and Java/C++ is that you may not export an object's instance variables; they are available to subclasses, but are otherwise inaccessible. (This is roughly equivalent to Java's "protected" concept.) Instead, Ruby has attributes: methods that get and set the state of an object. You can either write these attribute methods yourself, as in Example 3, or use the Ruby shortcuts in Example 4.
It's interesting to note the method called title= in Example 3. The equals sign tells Ruby that this method can be assigned to; it can appear on the left side of an assignment statement. If you were to write aSong.title = "Chicago", Ruby translates it into a call to the title= method, passing "Chicago" as a parameter. This may seem like some trivial syntactic sugar, but it's actually a fairly profound feature. You can now write classes with attributes that act as if they were variables, but are actually method calls. This decouples users of your class from its implementation: you're free to change an attribute back and forth between some algorithmic implementation and a simple instance variable. In Object-Oriented Software Construction (Prentice Hall, 2000), Bertrand Meyer calls this the "Uniform Access Principle."
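The two ways of defining attributes can be sketched side by side. The class names below are hypothetical, chosen so that the two variants can coexist in one file; attr_accessor is the standard Ruby shortcut that generates both a reader and a writer.

```ruby
# Hand-written attribute methods.
class ManualSong
  def initialize(title)
    @title = title
  end

  def title              # reader
    @title
  end

  def title=(new_title)  # writer, invoked by "song.title = ..."
    @title = new_title
  end
end

# The same interface using the built-in shortcut.
class ShortcutSong
  attr_accessor :title

  def initialize(title)
    @title = title
  end
end

# Uniform access: the writer can become algorithmic without
# callers changing how they use it.
class TidySong
  attr_reader :title

  def initialize(title)
    self.title = title   # goes through the writer below
  end

  def title=(new_title)
    @title = new_title.strip
  end
end

song = TidySong.new("  Chicago ")
song.title = " Fever  "
puts song.title
```

Callers of all three classes write song.title and song.title = "..." in exactly the same way, which is the decoupling the Uniform Access Principle describes.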
Blocks and Iterators
Have you ever wanted to write your own control structures, or package up lumps of code within objects? Ruby's block construct lets you do just that. A block is simply a chunk of code between braces, or between do and end keywords. When Ruby comes across a block, it stores the block's code away for later; the block is not executed. In this way, a block is similar to an anonymous method. Blocks can only appear in Ruby source alongside method calls.
A block associated with a method call can be invoked from within that method. This sounds innocuous, but this single facility lets you write callbacks and adaptors, handle transactions, and implement your own iterators. Blocks are also true closures, remembering the context in which they were defined, even if that context has gone out of scope.
The method in Example 5 implements an iterator that returns successive Fibonacci numbers (the series that starts with two 1s, where each term is the sum of the two preceding terms). The main body of the method is a loop that calculates the terms of the series. The first line in the loop contains the keyword yield, which invokes the block associated with the method, in this case passing as a parameter the next Fibonacci number. When the block returns, the method containing the yield resumes. Thus, in our Fibonacci example, the block will be invoked once for each number in the series until some maximum is reached.
Example 6 shows this in action. The call to fibUpTo has a block associated with it (the code between the braces). This block takes a single parameter; the name between the vertical bars at the start of the block is like a method's parameter list. The body of the block simply prints this value.
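An iterator of the kind just described might look like the following sketch. The method name fibUpTo comes from the text; the body and the local variable names are assumptions, since the original listing is not reproduced here.

```ruby
# Yields each Fibonacci number that does not exceed max.
def fibUpTo(max)
  i1, i2 = 1, 1            # the series starts with two 1s
  while i1 <= max
    yield i1               # hand the current term to the caller's block
    i1, i2 = i2, i1 + i2   # advance: each term is the sum of the previous two
  end
end

# The block's parameter list sits between the vertical bars.
fibUpTo(1000) { |term| print term, " " }
puts
```

Because the method only yields values, the caller decides what to do with them: print them, collect them in an array, or stop paying attention after the first few.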
If you write your own collection classes (or any classes that implement a stream of values), you can benefit from the real beauty of Ruby's iterators. Say you've produced a class that stores objects in a singly linked list. The method each in Example 7 traverses this list, invoking a block for each node. This is a Visitor Pattern in three lines of code. The choice of the name, each, was not arbitrary. If your class implements an each method, then you can get a whole set of other collection-oriented methods for free, thanks to the Enumerable mix-in.
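As a sketch of that idea, here is a small singly linked list whose each method walks the nodes. The class name LinkedList, the Node struct, and the push method are hypothetical; Enumerable is the standard mix-in named in the text.

```ruby
class LinkedList
  include Enumerable       # supplies map, sort, find, include?, and more

  Node = Struct.new(:value, :next_node)

  def initialize
    @head = nil
  end

  # Adds a value at the front of the list and returns the list.
  def push(value)
    @head = Node.new(value, @head)
    self
  end

  # The traversal: invoke the caller's block once per node.
  def each
    node = @head
    while node
      yield node.value
      node = node.next_node
    end
  end
end

list = LinkedList.new.push(3).push(1).push(2)
list.each { |v| print v, " " }
puts
p list.sort
p list.map { |v| v * 10 }
p list.include?(3)
```

Only each is written by hand; sort, map, and include? all come from Enumerable and are built on top of it.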
Blocks and Closures
Ruby blocks can be converted into objects of class Proc. These Proc objects can be stored in variables and passed between methods just like any other object. The code in the corresponding block can be executed at any time by sending the Proc object the message call.
Ruby Proc objects remember the context in which they were created: the local variables, the current object, and so on. When called, they recreate this context for the duration of their execution, even if that context has gone out of scope. Other languages call Proc objects closures.
The following method returns a Proc object:
def times(n)
  return Proc.new {|val| n * val}
end
The block multiplies the method's parameter, n, by another value, which is passed to the block as a parameter. The following code shows this in action:
double = times(2)
double.call(4) -> 8
santa = times("Ho! ")
santa.call(3) -> "Ho! Ho! Ho! "
The parameter n is out of scope when the double and santa objects are called, but its value is still available to the closures.
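Storing a block for later use is a common application of this. The sketch below uses a hypothetical Button class; the &action parameter is standard Ruby syntax for converting the block passed to a method into a Proc object.

```ruby
class Button
  # &action captures the caller's block as a Proc.
  def initialize(label, &action)
    @label = label
    @action = action
  end

  # Runs the stored block long after it was defined.
  def press
    @action.call(@label)
  end
end

presses = []   # local variable captured by the closure below
start = Button.new("Start") { |label| presses << "#{label} pressed" }

start.press
start.press
p presses
```

The block still sees presses when it is finally called from inside press, even though that variable is not visible to the Button class itself.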
Modules, Mix-ins, and Multiple Inheritance
Modules are classes that you can't instantiate: You can't use new to create objects from them, and they can't have superclasses. At first, they might seem pointless, but in reality, modules have two major uses.
Modules provide namespaces. Constants and class methods may be placed in a module without worrying about their names conflicting with constants and methods in other modules. This is similar to the idea of putting static methods and variables in a Java class. In both Java and Ruby you can write Math.PI to access the value of π (although in Ruby, PI is a constant, rather than a final variable, and you're more likely to see the notation Math::PI).
Modules are also the basis for mix-ins, a mechanism by which you add canned behavior to your classes.
Perhaps the easiest way to think about mix-ins is to imagine that you could write code in a Java interface. Any class that implemented such an interface would receive not just a type signature; it would receive the code that implemented that signature as well. We can investigate this by looking at the Enumerable module, which adds collection-based methods to classes that implement the method each. Enumerable implements the method find (among others). find returns the first member of a collection that meets some criteria. This example shows find in action, looking for the first element in an array that is greater than four.
[1,3,5,7,9].find {|i| i > 4} -> 5
Class Array does not implement the find method. Instead, it mixes in Enumerable, which implements find in terms of Array's each method; see Example 8. Contrast this approach with both Java and C#, where it is up to the class implementing the collection to also provide a considerable amount of supporting scaffolding.
Although a class may have only one parent class (the single inheritance model), it may mix in any number of modules. This effectively gives Ruby the power of multiple inheritance without some of the ambiguities that can arise. (And in cases where mixing in modules would cause a name clash, Ruby supports method renaming.)
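To see how a mix-in can build on each, consider this sketch of a user-written module. The names Findable, first_where, and Playlist are hypothetical; the structure mirrors the way the text describes Enumerable implementing find in terms of each, without claiming to be Enumerable's actual source.

```ruby
# A mix-in: behavior written once, in terms of the host's each method.
module Findable
  def first_where
    # yield here invokes the block given to first_where;
    # return leaves first_where as soon as a match is found.
    each { |item| return item if yield(item) }
    nil
  end
end

class Playlist
  include Findable         # mix in the module's methods

  def initialize(*titles)
    @titles = titles
  end

  def each
    @titles.each { |title| yield title }
  end
end

list = Playlist.new("Fever", "My Way", "Chicago")
puts list.first_where { |title| title.length > 5 }
p Playlist.ancestors.include?(Findable)
```

A class can include several such modules, each contributing its own methods, while still having a single superclass.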
Other Good Stuff
Other Ruby highlights include:
Classes and modules are never closed. You can add to and alter all classes and modules (including those built into Ruby itself).
Dynamic loading. Ruby modules (both source and binary) may be loaded dynamically, both explicitly and on demand.
Reflection. As well as supporting reflection into both classes and individual objects, Ruby lets you traverse the list of currently active objects.
Marshaling. Ruby objects can be serialized and deserialized, allowing them to be saved externally and transmitted across networks. A full distributed-object system, DRb, is written in about 200 lines of Ruby code.
Libraries. Ruby has a large (and growing) collection of libraries. All major Internet protocols are supported, as are most major databases. Extending Ruby is simple compared to adding extensions to Perl.
Threads. Ruby has built-in support for threads, and doesn't rely on the underlying operating system for thread support.
Object specialization. You can add methods to individual objects, not just classes. This is useful when defining specialized behavior for objects (for example, determining their response to GUI events).
Exceptions. Ruby has a fully object-oriented, extensible exception model.
Garbage collection. Ruby objects are automatically garbage collected using a mark-and-sweep algorithm. The choice of mark-and-sweep simplifies programming and makes writing extensions easier (no reference counting problems).
Active developer community. The Ruby development community is still a bazaar: small, intimate, and bustling. Changes are discussed openly and are made efficiently.
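Three of these highlights (open classes, object specialization, and marshaling) are easy to try in a few lines. This is a sketch only: the method names shout and on_click are invented for illustration, while Marshal.dump and Marshal.load are part of core Ruby.

```ruby
# Open classes: reopen the built-in String class and add a method.
class String
  def shout                # hypothetical method, not part of Ruby
    upcase + "!"
  end
end
puts "ruby".shout

# Object specialization: a method defined on one object only.
ok_button = Object.new
cancel_button = Object.new
def ok_button.on_click     # hypothetical GUI-style handler
  "saved"
end
puts ok_button.on_click
p cancel_button.respond_to?(:on_click)

# Marshaling: serialize an object graph to a string and restore it.
original = ["My Way", 1969, { "artist" => "Sinatra" }]
bytes = Marshal.dump(original)
restored = Marshal.load(bytes)
p restored == original
p restored.equal?(original)
```

The restored array is equal to the original in content but is a separate object, which is what makes the same mechanism usable for saving to disk or sending across a network.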
Some Real Examples
At this point, we'll present two larger Ruby programs. The first is a basic web server that echoes back the headers it receives. It's written as two classes; see Listing One. WebSession is a convenience class that provides two methods for writing to a TCP connection. The standardPage method is interesting. At a minimum, it writes a standard page header and footer. If called with a block, however, it inserts the value returned by that block as the page body. This kind of wrapping functionality is a natural use for Ruby's blocks.
The WebServer class uses Ruby's TCP library to accept incoming connections on a given port. For each connection, it spawns a Ruby thread that reads the header and writes the contents back to the client. The code in the thread is wrapped in a begin/end block, used in Ruby to handle exceptions. In this case, we use an ensure clause to make sure that the connection to the client is closed, even if we encounter errors while handling the request.
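The two techniques highlighted here, a page wrapper that takes an optional block and an ensure clause that always closes the connection, can be sketched without any networking. This is not Listing One: the bodies below are assumptions, the serve helper is hypothetical, and a StringIO object stands in for the TCP connection so the code runs anywhere.

```ruby
require "stringio"

class WebSession
  def initialize(connection)
    @connection = connection
  end

  def write(string)
    @connection.write(string)
  end

  # Writes a header and footer; a block, if given, supplies the body.
  def standardPage(title)
    write("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n")
    write("<html><head><title>#{title}</title></head><body>\n")
    write(yield) if block_given?
    write("</body></html>\n")
  end
end

# Hypothetical handler: echoes the request headers back to the client.
def serve(connection, header_lines)
  begin
    session = WebSession.new(connection)
    session.standardPage("Echo") do
      "<pre>\n" + header_lines.join("\n") + "\n</pre>\n"
    end
  ensure
    connection.close       # runs whether or not an exception occurred
  end
end

fake_socket = StringIO.new
serve(fake_socket, ["GET / HTTP/1.0", "Host: localhost"])
puts fake_socket.string
```

In a real server the connection would come from the TCP library and serve would run inside a thread per client; the wrapping and cleanup logic would stay the same.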
The second program packs a number of features into a small space. At its core, it represents the list of songs in an MP3 collection as an array, providing all the existing array functionality plus the ability to shuffle the entries randomly. If the array is sorted, then the entries will be ordered by song title. Each entry in the array is an object of class Song. As well as providing a container for the song title, album, and artist, this class implements the general comparison operator, <=>. This operator is used when sorting containers containing songs: In this case, we compare song titles.
There are two common approaches to making our MP3List act as if it were an array: delegation or subclassing. Listing Two shows the approach using delegation. The library module delegate provides a class SimpleDelegator, which handles all the details of forwarding method calls from class MP3List to the delegate. We create the array containing the songs, then invoke SimpleDelegator's initialize method (using super(songlist)) to set up the delegation. From that point on, MP3List will act as if it were an array. When users shuffle a song list, we create a new array containing the entries of the original in a random order, and use SimpleDelegator's __setobj__ method to delegate to that new array.
Listing Three shows an alternative implementation of MP3List in which we subclass the built-in Array class and add our own shuffle! method. Why the exclamation mark? Ruby convention is to append a "!" to methods that change their object (or are otherwise dangerous), and to append a question mark to predicate method names.
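The delegation approach can be sketched as follows. This is not Listing Two: Mp3Song is a hypothetical stand-in for the article's Song class, and the shuffle uses Array#shuffle, which current Ruby provides out of the box. SimpleDelegator, __getobj__, and __setobj__ are the real names from the standard delegate library.

```ruby
require "delegate"

# Stand-in for the article's Song class.
class Mp3Song
  attr_reader :title, :album, :artist

  def initialize(title, album, artist)
    @title, @album, @artist = title, album, artist
  end

  # General comparison operator: songs order by title.
  def <=>(other)
    title <=> other.title
  end
end

class MP3List < SimpleDelegator
  def initialize(songlist = [])
    super(songlist)        # every unknown method now forwards to the array
  end

  # The "!" signals that the receiver itself is changed.
  def shuffle!
    __setobj__(__getobj__.shuffle)   # delegate to a new, reordered array
    self
  end

  # The "?" marks a predicate method.
  def sorted?
    __getobj__ == __getobj__.sort
  end
end

list = MP3List.new([
  Mp3Song.new("My Way", "Album A", "Artist A"),
  Mp3Song.new("Chicago", "Album B", "Artist B"),
  Mp3Song.new("Fever", "Album C", "Artist C")
])

list.shuffle!
puts list.size                       # forwarded to the array
puts list.sort.map { |s| s.title }   # sorted by title via <=>
```

Because current versions of Array already include shuffle and shuffle!, the subclassing variant is less necessary today than when the article was written, but the naming conventions it illustrates still apply.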
Conclusion
Programming in Ruby is an immensely satisfying experience: the language is able to represent high-level concepts concisely, efficiently, and readably. It's easy to learn, and at the same time it is deep enough to excite even the most jaded language collector. Download a copy and try it for yourself.
Dave and Andy are consultants and coauthors of Programming Ruby and The Pragmatic Programmer, both from Addison-Wesley. They can be contacted at http://www.pragmaticprogrammer.com/.
Copyright Miller Freeman Inc. Jan 2001
