Category Archives: Smalltalk

New Public-Facing Website, built with Seaside and GemStone/S

We’ve just helped launch a new public-facing website, and it’s entirely built with GemStone and Seaside. The site is OurCatholicNeighborhood.com, and it provides content-managed pages for Catholic parishes, hospitals, schools, and charitable organizations. The site is supported by advertising, but not in the traditional sense — it’s more like “back of the parish bulletin” advertising. Behind the scenes, there are numerous management and review tools with hierarchy-based access control. This is all found under the “local” area, which is the dominant area of the site at present.

The site is currently running on Seaside 2.8, but we have plans to upgrade to 3.0-rc1 now that GemStone 2.4.4.1 is out. The web front end is lighttpd (naturally), and it integrates with Authorize.net using a cURL binding built in our own custom FFI (very similar to Alien, but hacked for x86_64). We built it in Pharo and deployed to GemStone; the GLASS environment (coupled with our own unit tests) made this pretty easy.

The entire project took one highly-distracted developer (me) 6 months to complete.

In a way, it’s significant to note what we’re not doing in this app:

  1. We’re not using any render caching: every page on the site is built on the fly every time it’s requested.
  2. Search results are not cached; every time a user uses the search bar, we scour the entire object tree looking for matches. Granted, this is a limited search — but for the number of instances we visit (belonging to 9 or 10 different classes), the system is surprisingly snappy. In my experience, SQL queries of this sort are nowhere near this fast.
  3. We’re not using any GemStone performance enhancements; this is an out-of-the-box configuration.
  4. We’re not currently using a paid license; everything we need (for now) fits within the scope of the free web edition.
  5. Static files are not (currently) hosted elsewhere; all images, stylesheets, javascripts, etc. are served by GemStone. If we need to, we can move these to Amazon S3 in a matter of minutes (ain’t resource URLs handy?).

The site is still very much a work in progress, as all sites are, but I thought it worth publicizing since most of our other Seaside apps are private. As is often the case, the people funding the website development didn’t care what technology was used to implement it — so long as it was up to the task. This gave us an opportunity to throw our favorite set of tools at the problem, and (so far) everyone is extremely happy with the results.

Benedictus Iesus in sanctissimo altaris Sacramento.

5 Comments

Filed under GLASS, Seaside, Smalltalk

Deep SIXX with XMLPullParser

At our company, we develop our GLASS apps in Pharo and then deploy to a GLASS repository on one of our servers, so we sometimes need to copy model objects from one environment to the other. One of our applications also performs regular imports from a third-party database, so we fetch it into a 32-bit Squeak image via ODBC and push it up into the GLASS repository from there.

We needed a platform-independent serialization format, and SIXX fit the bill. It works in Squeak/Pharo right out of the box, and there’s an official GemStone port courtesy of Dale Henrichs and Norbert Hartl.

The only problem we’ve had is that SIXX reading consumes a lot of memory. In Pharo, we have sometimes had to raise the maximum VM heap size. In GemStone, we were bumping up against the default VM temporary object memory ceiling. Dale deals with this issue in general in an excellent blog post. The size limit on temporary object memory is configurable, but the real solution is … well, to not use so much temporary memory.

For SIXX in particular, Dale modified the SIXX reader to use a persistent root for storage of SIXX objects during read, and he posted a script[1] to the mailing list that auto-commits when you approach the ceiling. This moves your temporary objects to permanent storage, kind of like using swap space: you’re out of RAM, so the OS saves some of your pages to disk and loads them back on demand.

OK, I know it’s more complicated than that, but that’s the basic idea.
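
In rough outline, the auto-commit check amounts to something like the sketch below. This is only an illustration of the idea, not Dale’s actual code; in particular, #_tempObjSpacePercentUsed is my recollection of the GemStone selector for querying temporary-object space usage, and the 75% threshold is arbitrary.

"Sketch: when temporary object memory is nearly full, commit, so that
 objects reachable from a persistent root are written to the repository
 and can be dropped from temporary memory."
(System _tempObjSpacePercentUsed > 75)
  ifTrue: [System commitTransaction
      ifFalse: [nil error: 'auto-commit failed']]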

Even using this approach, our ODBC import process was still hitting temporary memory limits. I confess that I didn’t spend much time analyzing the situation. Instead, I decided to throw a new tool at the problem: XMLPullParser.

XMLPullParser

Now before I go further, I should mention that XML is not exactly a passion of mine. When I have spare brain cycles, I don’t spend them on this sort of thing. There are XML-related acronyms that I couldn’t even define for you, much less explain. So if I get some details wrong here, please correct me in case somebody else cares about them.

Antony Blakey built an XML parser for VisualWorks with a pull-based API. He describes it thoroughly in his blog post, so I won’t go into much detail here. Essentially, your application drives the parsing process (that’s the “pull”) rather than having the parser notify you of what it found (“push”). The application pulls events from the parser, where events are things like “tag opened”, “text”, and “tag closed”. It’s like having a kid read the XML to you one piece at a time.

“What’s next, Johnny?”
“Uh, </person>”
“OK, if the next is a <politician>, skewer it.”

It’s a depth-first traversal, and it can be done on the fly without first loading the entire DOM. This means that you can read arbitrarily large XML files without high parser overhead.

Now, in Antony’s implementation, he simply wrapped the VisualWorks SAX parser’s output with a stream. This got him the API he wanted, but his hope was to eventually “really pull, without the SAX hack”.

With his permission, I ported XMLPullParser to Squeak, and it’s now available on SqueakSource. In my port, I mashed his work together with the XMLTokenizer class from YAXO, so the Squeak version really does pull.

The implementation is probably incomplete, but it’s parsed everything I’ve thrown at it so far. If you find a missing capability, you can probably just copy a method from XMLTokenizer — simply change senders of “next” to “nextChar”.

There are a few simple test cases in the package, but please don’t look to them for a good example of how to drive the parser. They use the lowest-level “what’s next” API to test the tokenizing only. Real-world usage of the parser involves higher-level operators like match:take:, if:peek:, etc.

parseResponseFrom: stream
  | parser |
  parser := XMLPullParser parse: stream.
  parser next.
  parser match: 'Response'
    take:
      [parser if: 'Errors'
        take:
          [parser while: 'Error' take: [errors add: parser text].
          ^self].
      parser whileAnyTake: [:tag | ... ]].

There are better examples out there, but hopefully this gives a little taste of what the pull parsing API feels like.
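
As one more small taste, here is a workspace-style sketch that counts elements in an arbitrarily large file without ever building a DOM. The file handling and document shape are invented for the example, and I’m assuming #while:take: consumes each matching element, as it does in the snippet above:

| parser count |
parser := XMLPullParser parse: (FileStream readOnlyFileNamed: 'people.xml').
parser next. "as in the example above"
count := 0.
parser match: 'people' take:
  [parser while: 'person' take: [count := count + 1]].
count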

Adapting to SIXX

Back to the problem at hand: how to attach this to SIXX. The SIXX code is fairly indifferent to the actual XML parser used, with all parser-specific details handled through a subclass of SixxXmlParserAdapter. But the entire SIXX framework expects that you’ll be dealing with fleshed-out DOM nodes, so I had no choice but to modify some core parts of SIXX itself.

My goal was to keep the modifications to SIXX to a minimum, so I had to make some tradeoffs. But with the changes described below, I was able to get all of the SIXX features working except one: truncated XML recovery. And the unit tests indicate that SIXX still works when running against YAXO.

The current version of this SIXX fork is on SqueakSource in the XMLPullParser project.

Initial Results

Let’s get pathological for a few minutes here. I have a 98MB SIXX file (standard SIXX mode, not compact) representing model objects from a small application in Pharo. If we log free space at several points during a simple read, we can tell a little about the actual memory used:

|rs root|
"1" rs := SixxReadStream readOnlyFileNamed: 'models.sixx'.
"2" [root := rs next] ensure: [rs close].
"3" rs := nil. "4"

If we take the free space at “1” as our baseline, then we can use the following rough interpretations:

  • Baseline minus free space at “2” is the DOM and stream overhead
  • Baseline minus free space at “3” is the space used by the DOM, root model and stream
  • Baseline minus free space at “4” is the actual memory consumed by the root model we loaded.
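
In Pharo, one way to log free space at each of these points is to force a full garbage collection and print the result. Here is a minimal sketch, assuming the Squeak/Pharo convention that Smalltalk garbageCollect answers the number of free bytes after a full collection:

logFreeSpace: aLabel
  "Force a full GC, then log how many bytes are free afterwards."
  | bytesFree |
  bytesFree := Smalltalk garbageCollect.
  Transcript show: aLabel , ': ' , bytesFree printString , ' bytes free'; cr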

Run 1: Pharo/YAXO: DOM and stream overhead is 441 MB (!), and the load took 14 minutes on my 2.33GHz Core 2 Duo laptop. It turns out that the root model consumes about 18MB in Pharo. Yes, SIXX in standard mode turns this into 98MB, which is a pretty low signal-to-noise ratio.

Run 2: Pharo/XMLPullParser: DOM and stream overhead is 2KB, and the load took 17 minutes. We took 3 minutes longer, which may come from the more spotty I/O (we’re not reading the entire file at once) and the extra compare/become phase (see below). But it saved us 440 MB of memory.

One other note on Run 2: The root model consumed a little over 24MB instead of the 18MB it took in Run 1. This is a consequence of the way we build collections in my tweaked version of SIXX; each growable collection has more empty space. More details below.

In GemStone, the test isn’t quite as simple, because the situation isn’t quite so simple.

Run 3: GemStone/YAXO: Dale’s script loads the XML string on the server, then launches the SIXX reader. I ran it and analyzed the memory usage using statmonitor/vsd.

VSD graph: SIXX Load with YAXO


What you see here is a graph of the VM temporary memory usage (in red) and auto-commit occurrences (spikes in cyan). The process took almost exactly 10 minutes.

Run 4: GemStone/XMLPullParser: With the XMLPullParser, we can use the same XML string load and auto-commit handler from Dale’s script but replace the guts of the SIXX load with the following: 

rs := SixxReadStream on: (ReadStream on: (UserGlobals at: #SIXX_LOAD_STRING)).
(UserGlobals at: #SIXX_PERSISTENCE_ARRAY) add: rs contextDictionary.
System commitTransaction ifFalse: [ nil error: 'Failed commit - persisting cached objects' ].
rootObject := rs next.

(Putting the SIXX context dictionary in a persistent root is the same trick the current GemStone port uses when you use Object class>>readSixFrom:persistentRoot:. The object graph gets saved to permanent storage as the auto-commits fire, instead of accumulating in temporary memory.)

The statmonitor/vsd analysis now looks like this:

Graph of memory usage with SIXX/XmlPullParser


Things started out similarly while we loaded the file, but then memory usage climbed in a much more tame pattern, just as we expected. Auto-commits occur when the size of the model itself is too large to hold entirely in temporary memory. Also, the whole load happened in 9 minutes instead of 10. Why is this? Somebody who knows more about GemStone internals will have to answer specifically, but it no doubt involves the overhead of moving objects back and forth.

Conclusion

The benefits of using this sort of parsing approach are pretty obvious. In both environments, you can load a much larger object graph using SIXX this way without either raising memory ceilings or “swapping” to permanent storage. For my pathological case, the swapping was still necessary but far less of it was needed.

If anyone is interested in the GemStone port of this work, I’ll put it up on GemSource. Since all of my initial work was done in Pharo, and the GemStone port of SIXX has departed from SIXX 0.3 in several key ways, bringing my branch into GemStone has been an adventure. It works for me, but it has a couple of key test failures that I haven’t had a chance to fix yet.

Gory Details

As I mentioned above, SIXX delegates the actual XML element interpretation to a subclass of SixxXmlParserAdapter. Messages sent to SixxXmlUtil class forward to the parser adapter as needed.

This is a good start, but it assumes that you’ve already got fleshed-out DOM element nodes in hand. In fact, the entire SIXX architecture expects this, with the parser adapters doing little more than returning sub-elements from them, fetching attributes from them, and so on.

All of the SIXX methods for instance creation and population take an argument, called “sixxElement”, representing the DOM element in whatever parser framework you use. In my case, I chose to use the entire parser as the sixxElement. The parser knows the current element, so implementing the forwarders for element name and attribute access was easy enough.

Next, I had to add hooks for tag consumption, essentially letting the SIXX framework indicate when it was done processing a particular tag event. The other parser adapters do nothing with these, but the XMLPullParser adapter advances its stream upon receipt of these messages. There were only a couple of places in the core SIXX framework where I had to hook these in.

SixxReadStream expected to stream over a whole collection of top-level DOM elements, so it had to be replaced. I built a custom SixxXppReadStream and augmented the parser adapter framework to allow for custom read stream classes. SixxXppReadStream allows every operation that SixxReadStream does except #size. Many streams can’t tell you their size anyway, so I didn’t consider this a major loss.

Next, I had to get rid of any place where SIXX asked for all sub-elements of the current node. In most cases, the pattern was something like:

(SixxXmlUtil subElementsFrom: sixxElement)
  do: [:each | ... ]

This was converted to a more stream-friendly pattern of #subElementsFrom:do:, which the XMLPullParser could implement as a further traversal, but other cases weren’t so straightforward.

When SIXX creates an object, it first instantiates it, registers it in a dictionary by ID (for later reference by other objects), then populates it. This lets SIXX deal with circular references, but it creates a problem for on-the-fly creation of collections. In the happy world of fully-populated DOM elements, the creation step can create a collection that’s the proper size by counting sub-elements. Then during the population step, it uses #add: or #at:put: to fill it in.

We don’t have the luxury of being able to look down the DOM tree twice, so in this case I have the instantiation step return an empty collection. If we’re dealing with a growable collection (Set, OrderedCollection, etc) then all is good. But if this is an Array, for example, the population step can optionally return a different object — the real object. If we detect that it’s different from the original, we use #become: to convert references from the empty object to the fully populated one.

Why do it this way? In GemStone 2.3, self become: other is not allowed, which is why the #become: is triggered based on the return value instead of being implemented in the collection population method itself. This means that every populating method needs to return self, and we pay performance penalties for the identity check and #become:.
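
In outline, the caller side of that check looks something like this (the selector names are illustrative, not the actual SIXX methods):

| registered populated |
registered := self instantiateFor: sixxElement. "an empty Array, already registered by its SIXX id"
populated := self populate: registered from: sixxElement. "populating methods answer self or a replacement"
populated == registered
  ifFalse: [registered become: populated] "forward existing references to the filled-in object"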

The other consequence is that our collections aren’t created with perfectly-tuned sizes (e.g. Set new: 25). Instead, they grow like normal, so they will inevitably have more internal “empty space”. In my Pharo tests, the model was 38% bigger. To me, this isn’t a very big deal; these collections will likely grow in the future anyway. We could solve it by more complex creation (e.g. store all elements in a temporary collection, then create final objects using #withAll: and such), but the extra code doesn’t seem worth it.

Credits

Everything useful that I’ve ever learned about GemStone has come from the documentation (which is excellent) or has been spoon-fed to me by Dale Henrichs and Joseph Bacanskas. Thanks everyone.

7 Comments

Filed under GLASS, Seaside, Smalltalk

Button Graphic Generator in VisualWorks/Cairo

On one of our projects, our customer wanted us to use some pretty buttons instead of the standard browser widgets. He’s not a graphic designer, but he’s got a good eye and he mocked something up in Apple’s Pages program. Then I implemented his design in VisualWorks using the Cairo vector graphics library and Pango text layout library.

I’ll assume a basic familiarity with Cairo in this post. If you need a quick primer, have a look at the Cairo Tutorial.

The VisualWorks binding for Cairo was written by Travis Griggs and is available in the Cincom Public Store Repository. It’s a fairly thin layer, but it is very clean and provides some nice mappings from VisualWorks graphics objects (Point, Rectangle, etc) to the underlying Cairo functions. The only tricky part about it is that you have to find some Cairo binaries somewhere. On my Mac, I use MacPorts and install the “cairo” port.

Back to the button. Here’s a screenshot from the Pages document that our customer sent:

buttons-screenshot-pages

Icons are from the free FamFamFam Silk Icons set, which is full of useful little 16×16 PNGs. Otherwise, the rest of the button is just a rounded rectangle, a gradient fill, label text, and a drop shadow.

To start, let’s make a class for these guys. Each button will have its own image and text, and I’ve imported the CairoGraphics namespace here so that I don’t have to scope all of my Cairo class references:

Smalltalk defineClass: #CairoButton
  superclass: #{Core.Object}
  indexedType: #none
  private: false
  instanceVariableNames: 'image text '
  classInstanceVariableNames: ''
  imports: '
      private CairoGraphics.*
      '
  category: ''

Then, our basic API. We need to be able to hand it an image (preferably by filename), a string for its text, and tell it to write itself to a PNG file (again by filename). We will use Cairo’s built-in PNG functions for load/save, and somewhere in the middle we’ll do some drawing:

text: anObject
  text := anObject

imageFile: aFilePath
  image := ImageSurface pngPath: aFilePath

writeToPngNamed: aFilePath
  | surface cr |
  surface := ImageSurface format: CairoFormat argb32 extent: 100 @ 40.
  self drawOn: surface context.
  surface writeToPng: aFilePath

drawOn: cr
  cr
    source: ColorValue white;
    paint.

We can invoke this now, and regardless of our input it writes a boring 100×40 file, but at least we’re running end-to-end.

(CairoButton new)
  imageFile: 'famfamfam_silk_icons_v013/icons/add.png';
  text: 'Add note';
  writeToPngNamed: 'addNote.png'

Next, let’s draw the rounded shape of the button. We’ll make it just a little smaller than the image itself to allow room for the drop shadow, and we offset by 0.5 to stroke just one pixel of border. Travis built a handy rounded-rectangle path helper #rectangle:fillet: into the CairoGraphics package, so we use it here:

drawOn: cr
  cr
    source: ColorValue white;
    paint.
  self drawShapeOn: cr

drawShapeOn: cr
  | extent |
  extent := cr surface extent - self shadowRadius.
  cr lineWidth: 1.
  cr rectangle: (0.5 asPoint corner: extent + 0.5) fillet: self cornerRadius.
  cr
    source: self borderColor;
    stroke

shadowRadius
  ^4

cornerRadius
  ^8

borderColor
  ^ColorValue brightness: 0.749

addnote1

This gives us a decent-looking rounded rectangle. We’d like to fill it with a gradient, which is easy enough to add. We’ll simply re-use the same path and tell Cairo to fill it with a custom gradient based on the one in the Pages file. Cairo’s linear gradient exists in 2D, with two endpoints and color “stops” at proportional distances between the endpoints. Since we’re trying to match what Pages did, we’ll build a gradient that starts at (0@0) and goes to (0@height).

drawShapeOn: cr
  | extent |
  extent := cr surface extent - self shadowRadius.
  cr lineWidth: 1.
  cr rectangle: (0.5 asPoint corner: extent + 0.5) fillet: self cornerRadius.
  cr source: (self backgroundGradientFrom: 0 @ 0 to: 0 @ extent y).
  cr fillPreserve.
  cr
    source: self borderColor;
    stroke

backgroundGradientFrom: aStartPoint to: aStopPoint
  ^(LinearGradient from: aStartPoint to: aStopPoint)
    addStopAt: 0 colorValue: ColorValue white;
    addStopAt: 0.43 colorValue: (ColorValue brightness: 0.9);
    addStopAt: 0.5 colorValue: (ColorValue brightness: 0.82);
    addStopAt: 1 colorValue: (ColorValue brightness: 0.95);
    yourself

addnote2

Now we have a shaded button, but with nothing on it. The image and the text will be inset from the edge of the button by 9 pixels horizontally and 8 pixels vertically; and there will be a spacing of 5 pixels between them.

drawOn: cr
  cr
    source: ColorValue white;
    paint.
  self drawShapeOn: cr.
  self drawImageOn: cr.
  self drawTextOn: cr

padding
  ^9 @ 8

spacing
  ^5

drawImageOn: cr
  cr saveWhile:
    [cr translate: self padding.
    cr
      source: image;
      paint]

drawTextOn: cr
  cr source: self textColor.
  cr moveTo: self padding + ((image width + self spacing) @ 0).
  (cr newLayout)
    text: text;
    fontDescriptionString: self font;
    showOn: cr

font
  ^'Arial Bold 14px'

addnote3

We’re getting closer. The button has its image and text, but its size is still fixed at 100×40. We really ought to measure the text and make our image size match. Fortunately, the Pango library (which we used to draw the text via the #newLayout method) can give us measurements for our text, so we gather that information before we create our initial PNG surface. Our image size will be derived from the text size, padding on all sides, the image size, spacing between the image and text, and the radius of the drop shadow:

writeToPngNamed: aFilePath
  | surface |
  surface := ImageSurface format: CairoFormat argb32
        extent: self textExtent ceiling + (self padding * 2) 
            + ((self spacing + image width) @ 0) + self shadowRadius.
  self drawOn: surface context.
  surface writeToPng: aFilePath

textExtent
  | surface |
  surface := CairoGraphics.ImageSurface format: CairoFormat argb32
        extent: 1 @ 1.
  ^(surface context newLayout)
    text: text;
    fontDescriptionString: self font;
    extent

addnote4

In #textExtent, I had to create a Cairo surface to use as a basis for Pango’s measurements. I could have used my “image” instance variable, but that didn’t seem like a healthy dependence to me. Better to create a scratch surface and throw it away.

Finally, we want a drop shadow. This is where we have to fake things a little. Cairo doesn’t have a “blur” operation, so we’ll take the shape of the button we’ve drawn and smear it around. To avoid having to draw our shape several times, we use Cairo’s built-in layering capability to draw our shape on a separate layer, then repeatedly place this layer down in the drop shadow region (thanks to Travis Griggs for the help on this):

drawOn: cr
  | button |
  cr
    source: ColorValue white;
    paint.
  cr pushGroup.
  self drawShapeOn: cr.
  self drawImageOn: cr.
  self drawTextOn: cr.
  button := cr popGroup.
  self drawShadowOf: button on: cr.
  cr
    source: button;
    paint

drawShadowOf: button on: cr
  cr saveWhile: 
      [cr translate: (Point r: self shadowRadius / 2 theta: 45 degreesToRadians).
      cr source: (ColorValue black alpha: 0.08).
      0 to: 359
        by: 45
        do: 
          [:n | 
          cr saveWhile: 
              [cr translate: (Point r: 1 theta: n degreesToRadians).
              cr mask: button]]]

addnote5

Now it looks just like the customer wanted, and all we have left to do is take out the white background that we forced into place, by removing the first two lines of #drawOn:.

Then, put your free icons and creative imagination to work:

closegitmo

deletetelevision

leavemeeting

shivertimbers

stopcomplaining

trysmalltalk

3 Comments

Filed under Smalltalk

Mold: Form Validation for Seaside

A long time ago, I asked about systems for form validation that aren’t “model-based”. By “model-based validation”, I mean that the rules for whether a certain sort of input is acceptable are declared in (or attached to) a domain model that the form is operating on. This is the way that ActiveRecord (Rails) and Magritte (Seaside) work.

I don’t like the whole approach, for reasons I discussed in my earlier post. It breaks down when you want to edit an object in stages, and you end up managing lots of UI-related things down in the model code. But forms are boring and validation can be tedious, which is why frameworks like these have been built in the first place.

What I really wanted was a set of simple helpers that made this work easier — a “less is more” approach to the problem. That didn’t exist (at least in Seaside), so I built my own.

It’s called “Mold”, and it’s available both on SqueakSource and in the Cincom Public Repository. The name is a play on words: it’s a synonym for “form”, but it makes most normal people think of green fuzzy fungi.

Design Principles

In building Mold, I had several specific goals:

  1. No new components. I didn’t want to have to remember to add anything to #children just because I was using this framework.
  2. Keep track of what the user types, even if it’s not valid, to allow for easy correction of mistakes. “r56” might not parse to a valid integer, but it’s better to let the user delete the “r” than force it to “0” and complain at him.
  3. Emit real objects, not just strings. If I’m asking the user to enter a date or a time, I want a Date or Time object when all is said and done.
  4. Use block-based validation criteria to keep things flexible. Error messages should allow blocks too, so that you can put dynamic behavior in them.
  5. Correlate errors closely to the problematic fields. It’s more helpful to show an error message right next to the field than to show it at the top of the page.
  6. Strip leading and trailing whitespace, and convert empty strings to nil. Semantically, an empty string almost always means “nothing” and is rarely worth keeping around.
  7. Callback or dictionary-like access to valid data. Sometimes you want to grab bits of data out of the form by name, but most of the time it’s nice to have the form dump valid results right into your model in one step.
  8. Don’t require all-or-nothing use. I might want to use the helpers to build part of the form, but handle other parts myself. It should be possible to completely customize the look of each form without sacrificing the benefits of the framework.

The Basics

To use a mold, you typically instantiate and hold a Mold in an instance variable of your component. For a simple form with no underlying model, you might build the mold when the component is initialized.

initialize
  super initialize.
  self buildMold

For editors with a model under the hood, it makes sense to build the mold when the model is passed in:

account: anAccount
  account := anAccount.
  self buildMold

The mold itself has a canvas-like API for declaring a form’s inputs and validation conditions.

buildMold
  mold := Mold new.
  (mold stringField)
    label: 'Username:';
    on: #username of: account;
    beRequired.
  (mold passwordField)
    label: 'Password:';
    on: #password of: account;
    beRequired. 

In this simple form, we only ask for 2 things, and we hook them directly to accessor methods on the model using #on:of:. This works just like it does in a regular Seaside form, and behind the scenes it simply creates a callback block. You can also create a custom callback block yourself.
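
For example, a hand-rolled callback might look roughly like the following. Note that #callback: is my assumption about the selector involved (by analogy with ordinary Seaside inputs), and the displayName: accessor is purely for illustration; #on:of: above is the form actually shown in Mold’s examples.

(mold stringField)
  label: 'Display name:';
  callback: [:value | account displayName: value]; "assumed selector; check Mold's real API"
  beRequired.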

When it comes time to render your form, you have to hand the `html` canvas to the mold, and then it gives you lots of handy shortcuts. Basic use will look something like this:

renderContentOn: html
  html form:
    [mold canvas: html.
    mold paragraphs.
    (html submitButton)
      callback: [self save];
      value: 'Save']

This usage tells the mold to render those fields as HTML paragraphs, like so:

Generated with Mold, using the #paragraphs helper

The labels are real HTML <label> tags, and each group is a single paragraph (<p><label for=”…”>Username:</label><br /><input … /></p>). We could have also used #tableRows:

renderContentOn: html
  html form:
    [mold canvas: html.
    html table:
      [mold tableRows.
      html tableRow:
        [html
          tableData;
          tableData: 
            [(html submitButton)
             callback: [self save];
             value: 'Save']]]]

It’s more work to build the framework around the table, but the end result looks like this:

These are the only two “canned” looks for an entire mold, but it’s also possible to take the reins yourself and ask Mold to render single components for a completely custom look. More on that below.

The final step in using this mold is hooking up the #save callback. Let’s assume you’re using the super cool SandstoneDb framework to save your models:

save
  mold isValid ifFalse: [^self].
  account save: [mold save].
  self answer 

That’s all there is to it. The mold can tell you whether its inputs were valid, and if not, it will display error messages on subsequent renders. If it is valid, telling it to save will fire all of its callbacks, thereby applying the changes to the underlying model.

The way I use Glorp, the save method looks nearly identical, but you have to register your model object in a unit of work. Using the mold’s save actions to apply your changes inside the unit of work keeps Glorp’s deep dark change-tracking voodoo working.

save
  mold isValid ifFalse: [^self].
  self database inUnitOfWorkDo:
    [:db |
      db register: account.
      mold save].
  self answer

Improving Looks

Let’s look at what interactions with the mold look like. We declared that both fields should be required, so if you don’t type anything (and just click “save”) you’ll see error messages by each field:

This form is a little bland, and its spacing is awkward because Mold uses unordered lists inside those table cells. Let’s apply some simple CSS:

style
  ^'
label.required { font-weight: bold; }
label.required:after { content: "*"; color: red; }
.error { background-color: #ecc; }
.errors { color: red; margin: 0; }
' 

That looks better — we call attention to the required fields, error messages are shown in red, and fields with errors have a reddish background too. Let’s make the username field a little wider. We’ll do this by adding a #customize: block in the mold declaration:


buildMold
  mold := Mold new.
  (mold stringField)
    label: 'Username:';
    on: #username of: account;
    customize: [:tag | tag size: 40];
    beRequired.
  (mold passwordField)
    label: 'Password:';
    on: #password of: account;
    beRequired. 

Now the next time we build one of these components (remember, we built the mold when the component was initialized, so it won’t automatically be rebuilt just from a browser refresh), our form will look like this:


More Conditions and Inter-Relationships

Let’s modify the form a little further. If we require usernames to be at least 3 characters long and passwords to contain a digit, we need some more conditions on these fields. We should also make the user type the password twice to guard against typos.


buildMold
  | passwordField confirmPasswordField |
  mold := Mold new.
  (mold stringField)
    label: 'Username:';
    on: #username of: account;
    customize: [:tag | tag size: 40];
    beRequired;
    addCondition: [:input | input size >= 3] labeled: 'Usernames must be at least 3 characters long'.
  (passwordField := mold passwordField)
    label: 'Password:';
    on: #password of: account;
    beRequired;
    addCondition: [:input | input matchesRegex: '.*\d.*']
      labeled: 'Please make sure your password has at least one number in it'.
 (confirmPasswordField := mold passwordField)
    label: 'Confirm Password:';
    beRequired.
  passwordField
    addCondition: [:input | input = confirmPasswordField input]
    labeled: 'Passwords did not match'.
  confirmPasswordField
    addCondition: [:input | input = passwordField input]
    labeled: 'Passwords did not match'. 

There’s a bit more going on here, but it’s all pretty straightforward to use. A few things to note:

  1. We’ve added conditions on required fields, so these won’t be evaluated unless some input is actually given. If these fields were optional, we’d have to check `input` to make sure it wasn’t nil before asking it for its size.
  2. The fields can refer to each other, even out of order. We didn’t technically have to put conditions on both fields, but it makes the error messages look nicer if we do.
  3. There is no callback on the confirmation field. It simply exists for use by the main password field.
  4. When referring to the other fields, we asked them for their #input. This is the string the user typed (having been trimmed of leading and trailing whitespace and converted to nil if it was empty). We could have also asked for its #value, but the value is only valid when the field is valid (incidentally, fields also understand #isValid).

The resulting form looks like this:

One “Gotcha”

Under the hood, the out-of-order field processing is done with a hidden input with a low-priority callback. This is hooked up when the mold’s canvas is set (mold canvas: html). The callback doesn’t fire until all of the other input’s callbacks are processed, but it fires before the action callback from the button.

That means that you must set the mold’s canvas inside the form: [] block. Failure to do so will mean that your validations never get run, and the mold will always answer `true` when sent #isValid.
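
To make that concrete, this is the broken arrangement; the canvas is set outside of #form:, so the hidden input (and with it the validation callback) never becomes part of the form:

renderContentOn: html
  "Broken: the mold's canvas is set outside the form block."
  mold canvas: html.
  html form:
    [mold paragraphs.
    (html submitButton)
      callback: [self save];
      value: 'Save']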

There might be a cleaner way to do this, but I haven’t found it yet.

Advanced Moldiness

Mold fields can also be given a symbol for a key, which lets you refer to them directly. This makes custom rendering possible, with messages like:

mold paragraph: #confirmPassword.
mold widget: #username
mold errors: #someOtherField

When you use #on:of: to hook up a callback, the key for the widget is automatically assigned to the selector you passed as the first argument. You can also set the key directly using #key:, and a subsequent send of #on:of: will not clobber it.

Error messages can also be blocks, which means that you can put the user’s input into the error message:

(mold stringField)
  key: #username;
  addCondition: [:input | (Account find: [:each | each username = input]) isNil]
    labeled: [:input | 'The username "', input, '" is already taken'].

There’s also no reason why you can’t add two or more molds to a component — say, one for basic settings and one for advanced settings that aren’t shown unless the user clicks “more choices”.

Adding fields is a matter of adding a new protocol on Mold and optionally a new Field subclass. I say “optionally” because some fields can be built as specialized versions of existing fields, like a string field with a default validation rule. The emailField is currently implemented this way.
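
As a sketch of what I mean (this is just the shape of it, not the actual implementation), an emailField built on top of stringField might look something like:

emailField
  "A specialized string field with a default validation rule (sketch only)."
  ^(self stringField)
    addCondition: [:input | input includes: $@]
      labeled: 'Please enter a valid email address';
    yourself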

In Summary

I know not everyone believes that model-based validation is a problem. But if all you’re looking for is a simple way to build a custom form, you might find Mold helpful. We’ve been using it internally for nearly a year, so I figured it was time to touch it up and share it.

Mold makes no requirements of your components or your model objects. It doesn’t use any metadata; it allows you to choose how you want each form rendered; and it doesn’t require you to use it to build the entire form. It’s just a helper, and it gets out of your way when you don’t need it. 

Just don’t leave it alone in a cool dark place for too long, or it might start to grow on you. :)

16 Comments

Filed under Seaside, Smalltalk

Seaside Presentation at 3CLUG

Travis Griggs and I will be giving a Seaside presentation titled “Lay Rails to REST” for the Tri-Cities Linux Users Group this coming Saturday (December 8th, 2007). If you’re in southeastern Washington state and interested in learning more about Seaside, and how it compares to Rails in particular, stop by West 248 at the WSU Tri-Cities campus at 1:00. It should be a fun and informative presentation.

Update: Yes, that’s south*eastern* Washington. Sorry for the typo earlier.

Leave a comment

Filed under Seaside, Smalltalk

My Full-Circle Journey Back to Smalltalk

I suppose it’s time to tell my story. I was a Smalltalk zealot in the late 90’s, but I left it behind when I started my own business. I’ve now come full-circle, and I’m finding once again that Smalltalk is the best tool for most of the programming work I do.

I learned Smalltalk programming from Travis Griggs while working at Key Technology. He mentored me in Smalltalk and other general programming methods, and I introduced him to Linux. Travis likes to give me more credit than I deserve for some of the work that we did together, but one thing was certain: we had a blast working together.

Key went through a difficult acquisition, and I spent most of my energy working to bring our two software teams to common ground. My wife and I travelled back and forth between Medford, Oregon and our home in Walla Walla (with our newborn son, our first) while I worked with developers in both locations. Diplomacy is not something I particularly enjoy, but I’m good at it and the team really needed somebody in that role.

But the end result was that I didn’t get much time to do any significant programming. The more time passed, the more I missed it. I didn’t get to work with Travis as much anymore, and our team was becoming fractured as our project list grew. Despite all of my efforts to bring people together, our team was dissolving under the increasing project load.

I had always wanted to work from home, and I was becoming burned out in my Key position, so I left Key and took up consulting work. Most of my clients needed web applications, so I became a web developer.

I built my first web applications in Smalltalk. Travis and I had written the original WikiWorks code, and I used our experience there to build my own HTTP 1.1 server. This later became the basis for the Swazoo server, which was built on my code at Camp Smalltalk 2000 in San Diego. I implemented my own Smalltalk Active Pages (abbreviated SAP, to distinguish them from ASP sites) and built several sites using them.

At the time, there wasn’t much web application work being done in the Smalltalk world, and the frameworks people were developing were unattractive to me. I had to support myself, from top to bottom, and this was starting to feel painful.

This was what drew me towards PHP, Apache, and PostgreSQL. The promise of safety in numbers was attractive — I would never have to wonder if my server was fully compliant with the HTTP specs. But the main attraction for me was PHP’s support for various open source libraries. PHP made it easy to handle image uploads in multiple formats, draw custom graphs, and build other visuals on the fly. Editing PHP apps in vim was easier over low-bandwidth connections (like a cell phone data connection) than making a VNC connection to a headful server image, too.

So I left Smalltalk behind and built a few large PHP apps. Nearly all of them relied heavily on PHP’s graphics library bindings for generation of custom reports and editable page components. I built content management systems that let my customers edit their page titles, for example, and behind the scenes PHP would generate new navigation buttons from template PNGs and TrueType fonts. Another app managed a database of donors for a non-profit radio network and allowed call-center volunteers to easily enter pledges from supporters — while the studio crew saw up-to-the-minute totals on their screens.

Things went smoothly at first but got rougher as the applications got bigger. I made heavy use of the flimsy object model in PHP 4 and built my own object-relational layer, and this helped me survive as long as I did. But I grew increasingly frustrated as I tried to scale my PHP applications. There was no way around the ugliness of the code.

seaside.jpg

At that time, somebody pointed me at one of the early versions of Seaside. I had taken a look at Seaside a few years earlier, and while it looked interesting, I confess that I didn’t get it. I saw how call: and answer: worked to give you a certain level of modalism in your application, but that was all I saw. The Seaside HTML writers were less mature then, too, and that was a deterrent to me. I liked being able to interleave PHP and HTML code in template files, but that was mostly because my pages didn’t use CSS effectively.

I missed the larger picture, so I went back to PHP.

When Ruby on Rails started to gain popularity, it was like a breath of fresh air to me. Rails is a very clean framework for web applications, and it required no paradigm shift. Much of the work I had done in PHP already followed some of the Rails patterns, but now I had a full-fledged object-oriented language at my disposal again. I thought I had finally found the best of both worlds — a thriving community, lots of support for graphics formats, easy database connectivity, and a nice clean dynamic language.

These days, that would sound like the end of the story. But it’s not. The truth is, I’m disappointed with Ruby. Rails is “classic web development done very cleanly”, as Ramon Leon says. And Ruby is my favorite non-Smalltalk language. But it’s not Smalltalk. In fact, I’ve come to see that Ruby missed the mark on several points.

  1. File-based source control. In a Smalltalk image, all of the classes and code libraries your application needs are available to you immediately. Tools like the Refactoring Browser are much harder, if not impossible, in Ruby.
  2. Reflection. Ruby’s reflection is OK, but it’s much weaker than the reflection in Smalltalk. If more of the Ruby base code was itself written in Ruby, and Ruby allowed better reflection, a Ruby code browser would undoubtedly be easier to write (and more powerful once written).
  3. Speed. The Ruby interpreter is slow. In Rails, I still end up writing more complicated database queries than I should have to, because the Ruby interpreter is so slow to process data in memory.
  4. No Keyword Messages. Travis and I ran some tests simulating keyword-style messaging with Ruby’s last-argument-collapsed-as-hash, and the performance was terrible.
  5. Blocks. Ruby simply missed the mark on these. All blocks should be first-class objects, not just a layer of syntax.
  6. Live Interaction. Irb is good, but Smalltalk workspaces are much better.
  7. Debugger. The Ruby debugger feels like gdb compared to Smalltalk. Most of the time, people opt to debug using console output and interactive exploration using Irb. It’s better than printf, but not much.
  8. Class Library: The Ruby class library is much younger than the extensive class library in Squeak or VisualWorks.

As far as Rails is concerned, I have far fewer criticisms. But once you’ve looked at Seaside, and you really understand the approach Seaside is taking, Rails just looks old. I never want to marshal data through a URL again if I can help it.

I’ve also come to see that unless you’re a light user of a technology, you have to be able to support yourself. This is not a downside for Smalltalk, it’s a reality of the complex environment we work in. Smalltalk’s web and graphics tools have come a long way lately, but ultimately Smalltalk empowers you to support yourself in a way that Ruby just doesn’t. I have been digging in the Ruby 1.8 source tree more often than most, and it’s not nearly as easy to navigate as the Smalltalk base. Remember, most of the Smalltalk base is implemented in Smalltalk.

Seaside is certainly more advanced than Rails in terms of raw technology, to the point that it’s hard to wrap your head around the concepts. It took me two attempts, and I’m fairly young and intimately familiar with Smalltalk. Rails is smoother — it gives a better out-of-the-box experience in the short term. But for larger web apps, you need to be thinking further than that.

Don’t be surprised if Seaside just looks weird to you at first. Leave it for a little while and then come back. You’ll be glad you did. And when it clicks for you, beware. You won’t ever look back.

13 Comments

Filed under Rails, Ruby, Seaside, Smalltalk

Identity and Equality in Ruby and Smalltalk

One important concept in object-oriented languages is the difference between equality and identity. The concept isn’t complicated, but the English words we use to talk about it are imprecise. The thesaurus says that “identity” and “equality” are synonyms. So let’s back up and make sure we’re being clear on the concepts.

We often talk about variables as if they “hold” a certain object, but that’s not a very good metaphor. You can put the same object “inside” any number of different variables. If we’re working with containment as a metaphor, we run into problems here. It’s as if we’re saying that the same person is in two different houses at the same time.

Instead, we need to think of variables as names. Different people use different names for the same thing (Puma, Mountain Lion, Cougar) or the same name for multiple things (my “home” is not the same house as your “home”). Names are just a reference to something real, a way to address things.

We say that two variables are identical when they both refer to the same object. If they refer to different objects that represent the same value, we say that they are equal. Consider, for example, two points that have the same X and Y coordinates. They are equal, but not identical. Objects themselves can tell you whether they’re identical or equal.

In Smalltalk, the relevant messages are:

= The objects are equal
== The objects are identical
hash A number that must be the same for all equal objects, typically used by collections

The Smalltalk virtual machine is the sole place where object identity can be determined, so the #== method is implemented as a primitive and never overridden by any class. But scores of classes implement their own method for #= and #hash. A classic Smalltalk programming mistake is to override #= but not #hash, thus breaking the requirement that two equal objects have the same hash value.
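
For example, a typical pair of methods for a simple point-like class with x and y instance variables looks something like this (a generic sketch, not code from any particular library):

= other
  "Answer whether the receiver and the argument represent the same coordinates."
  ^self species = other species
    and: [x = other x and: [y = other y]]

hash
  "Hash the same state that #= compares, so equal objects hash alike."
  ^x hash bitXor: y hash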

In Ruby, there are actually significantly more equality messages that you may encounter. The definitions I’m using here come from pp. 95 and 571 of Programming Ruby:

== Test for equal value
=== Used to compare each of the items with the target in the when clause of a case statement
<=> General comparison operator. Returns -1, 0, or +1, depending on whether its receiver is less than, equal to, or greater than its argument.
.eql? True if the receiver and the argument have both the same type and equal values. 1 == 1.0 returns true, but 1.eql?(1.0) returns false.
.equal? True if the receiver and argument have the same object ID
hash Generates a Fixnum hash value for this object. This function must have the property that a.eql?(b) implies a.hash == b.hash.

This is a little overwhelming. Ruby’s :equal? method is a test for identity, and like Smalltalk, it is implemented in Object and never overridden. That gets us started.

The methods for :eql? and :hash are probably the most similar to Smalltalk. If you override one, you need to override the other. Ruby’s collections use them to determine equality for things like Array#uniq or Hash lookup keys.

This has bitten me more than once. I’ve implemented a new sort of object and overridden ==, only to find later that

  • I can’t use those objects for Hash keys, and
  • Arrays return unexpected results when sent :uniq

It seems that instead of overriding ==, I should have overridden eql? and hash. This makes my Hashes and Array#uniq results work like I expected. But wait… The default Ruby implementation of == defers to equal?, which is not what I want either:


>> g1 = GridAddress.new(:key)
=> #<GridAddress:0xb7b569c8 @key=:key, @row=nil>
>> g2 = GridAddress.new(:key)
=> #<GridAddress:0xb7b50028 @key=:key, @row=nil>
>> g1.eql?(g2)
=> true
>> g1.hash == g2.hash
=> true

>> g1 == g2
=> false
There’s no deferral from one of these messages to another. In other words, the behavior I want requires me to override eql?, hash, and ==. These are independent, parallel implementations. The difference is primarily semantic. With eql?, there is no attempt to coerce the objects to a similar type. Methods for ==, on the other hand, typically try to coerce objects to the same type before making comparisons.

Finally, we come to the “spaceship” operator, <=>. This is used by a mixin called Comparable, and you automatically get various other comparison operators (including ==) if you include Comparable and implement <=>. The only objects that understand it are those for which less-than and greater-than comparisons actually make sense.

In my opinion, Ruby complicates things here with little payoff. Most programs compare using ==, and the details of other equality comparisons are lost on most Ruby programmers. I think it’s a fair generalization that most Ruby developers don’t use their own objects as hash keys, probably due to a sort of bias towards the more “primitive” object types like Ramon Leon talks about (see “Obsession with Simple Types” in this post).

It’s also worth noting that if you override eql? or ==, you’re expected to check to make sure that the objects have the same type before you start comparing any details. This is the common pattern in Smalltalk, too. Most Smalltalk implementations of #= first check in some way to see that the two objects are the same kind of thing and then proceed to compare relevant details.

6 Comments

Filed under Ruby, Smalltalk