Monday, June 16, 2008

de novo project retrospective

I’m putting the finishing touches on my “webtwo” application and thought I'd post a quick postmortem on the project.

The core set of functionality was straightforward:


  • Users with system defined logins (not relying on database logins) and system specific roles

  • Competitions consisting of user submissions. Each submission can have multiple submission items consisting of text or images.

  • Everything can be tagged and commented upon.

  • Tags can be reused and applied to any object in the system.

Here are my observations:


My initial design had ratings as a separate class/table. I decided to push ratings down into both tag and user_text (the generic data type that holds comments) since, upon further consideration, I thought that the meaning of the term “rating” was different in each case. If the rating is attached to a comment, both the rating and the comment refer to some other item, and the comment explains the rating. On the other hand, if the rating is attached to a tag, it indicates how strongly the item reflects the tag term.


Even though tagging and commenting (adding user text) are distinct functionalities, much of the persistence operation is the same. I therefore packed persistence into a single class which persists tags and text in distinct methods. At some point I may factor this into three classes: one for the common persistence core and two for the specific tagging and commenting functionality. For now, this single class is adequate for the current purpose.
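A minimal sketch of that single-class arrangement, under stated assumptions: the class, method and field names are hypothetical, an in-memory list stands in for the database, and the trimming/lower-casing of tag terms is only an illustration of tag-specific logic, not something the webtwo code is known to do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: one class persists both tags and user text.
// An in-memory list stands in for the database so the example is self-contained.
class AnnotationPersister {

    // One stored row: the kind of annotation, the object it refers to, and its content.
    static class Row {
        final String kind;       // "tag" or "user_text"
        final String tableName;  // table of the referenced object
        final long objectId;     // id of the referenced object
        final String content;

        Row(String kind, String tableName, long objectId, String content) {
            this.kind = kind;
            this.tableName = tableName;
            this.objectId = objectId;
            this.content = content;
        }
    }

    private final List<Row> store = new ArrayList<>();

    // Tag-specific entry point: normalizes the term (an assumption) before saving.
    Row persistTag(String tableName, long objectId, String term) {
        return save("tag", tableName, objectId, term.trim().toLowerCase(Locale.ROOT));
    }

    // Text-specific entry point: keeps the wording, only trims whitespace.
    Row persistText(String tableName, long objectId, String text) {
        return save("user_text", tableName, objectId, text.trim());
    }

    // Common persistence core shared by both entry points.
    private Row save(String kind, String tableName, long objectId, String content) {
        if (content.isEmpty()) {
            throw new IllegalArgumentException("empty content");
        }
        Row row = new Row(kind, tableName, objectId, content);
        store.add(row);
        return row;
    }

    int count() {
        return store.size();
    }
}
```

If the class were later split in three, the private save method would be the natural candidate for the common core.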


As mentioned previously, my initial design allowed fairly large scale image uploads with dynamic resizing for page display. I have since switched to a more conventional architecture in which standard image sizes are generated and stored upon upload. This is a big performance win, and also reduces storage requirements.
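A rough sketch of generating the standard sizes once, at upload time, using only the JDK's java.awt imaging classes. The class name, the size names and the pixel widths are assumptions chosen for illustration; they are not the values used in webtwo.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: produces a fixed set of renditions at upload time,
// so that page requests never have to resize anything.
class StandardImageSizes {

    // Assumed size names and maximum widths in pixels (illustrative values only).
    static final Map<String, Integer> MAX_WIDTHS = new LinkedHashMap<>();
    static {
        MAX_WIDTHS.put("thumbnail", 100);
        MAX_WIDTHS.put("medium", 400);
        MAX_WIDTHS.put("large", 800);
    }

    // Scales the image down to the given width, preserving the aspect ratio.
    // Images already no wider than maxWidth are returned unchanged.
    static BufferedImage scaleToWidth(BufferedImage source, int maxWidth) {
        if (source.getWidth() <= maxWidth) {
            return source;
        }
        float ratio = maxWidth / (float) source.getWidth();
        int height = Math.max(1, Math.round(source.getHeight() * ratio));
        BufferedImage scaled = new BufferedImage(maxWidth, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(source, 0, 0, maxWidth, height, null);
        g.dispose();
        return scaled;
    }

    // Builds one rendition per standard size; the caller would store these with the upload.
    static Map<String, BufferedImage> renditionsFor(BufferedImage upload) {
        Map<String, BufferedImage> renditions = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> size : MAX_WIDTHS.entrySet()) {
            renditions.put(size.getKey(), scaleToWidth(upload, size.getValue()));
        }
        return renditions;
    }
}
```

The page then only has to pick a stored rendition by name, which is where the performance gain comes from.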


Seam is a platform that is evolving in an encouraging way. When I wanted to add sorting to a table displayed within an Ajax tab, I could not see how to do it, but upon further investigation it appears to be supported in an upcoming release.

For example, in one view I can sort the table, while in another I cannot. [Screenshots of the two views]

Type X

For this system I had separate "type tables" for each type category; e.g., user_text_type holds the allowable types of user_text such as comment, description, etc. As I consider building larger systems with similar functionality, I plan to have an item_type table which holds the categories for every table that requires typing. This table would follow a pattern similar to the one used for user_text (comments, descriptions) and tags, each of which holds the table_name and the id of the referenced object. This will make the data model and code easier to understand and will facilitate building curation tools.
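To make the idea concrete, here is a small in-memory sketch of what a single item_type lookup could look like. Everything in it is hypothetical (the class, its methods, and the example type names beyond comment and description); it models the table as pairs of table_name and type name rather than showing real schema or Hibernate mappings.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a single item_type table:
// each entry pairs a table_name with one allowable type for rows of that table.
class ItemTypeRegistry {

    private final Map<String, Set<String>> typesByTable = new LinkedHashMap<>();

    // Equivalent to inserting one (table_name, type_name) row.
    void register(String tableName, String typeName) {
        typesByTable.computeIfAbsent(tableName, k -> new LinkedHashSet<>()).add(typeName);
    }

    // True if typeName is an allowable type for rows of tableName.
    boolean isAllowed(String tableName, String typeName) {
        Set<String> types = typesByTable.get(tableName);
        return types != null && types.contains(typeName);
    }

    // All types registered for a table, in insertion order;
    // the kind of lookup a curation tool might use.
    Set<String> typesFor(String tableName) {
        return typesByTable.getOrDefault(tableName, new LinkedHashSet<>());
    }
}
```

With one registry, the per-category tables such as user_text_type collapse into entries that differ only in their table_name.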


One additional note about the Seam framework: it sometimes feels like a REST interface, since it is easy to bookmark access to a particular object, etc. However, a page does not “see past” its own request: parameters which are not specified for "pass through" in the .page.xml files are stripped out. It is easy enough to add them; all that is required is to specify a <param name="parameterForPassthrough"/> in the .page.xml for each parameter that should be passed through. This declaration also needs to be in every intervening page.

I'm not particularly bothered by this; I'm just pointing it out. On the good side, it forces standard parameter naming so that each page's logic can operate upon standard variable names (assuming all of the necessary parameters have already been incorporated into the .page.xml files), and it prevents the URLs from becoming unwieldy. The downside is that all of the intervening pages have to be modified if a new parameter is required. On balance it seems to be a reasonable decision.

Note: You may ask "when would this apply?". One example: you want to comment upon an item and return to it when the comment is complete. In this case the item/item_id would need to be passed through the comment-editing-page to the comment-verification/completion-page so that the "done/comment-complete" button can return to the right location.

The only real regret that I have about the project is that I didn't know about the Yahoo design patterns earlier.

Friday, June 13, 2008

Yahoo Design Patterns

When visiting Yahoo's site for OmniGraffle stencils, I found that Yahoo also provides a nice set of design patterns.

These Web/Web2.0 patterns are clear, concise and have pointers to how they are used within Yahoo. Note: this doesn't constitute a strong endorsement on my part. My approach to patterns is to broadly survey existing patterns with the goal of settling on one, or worst case to modify/create one using the results of the survey.

BTW OmniGraffle is an amazing drawing program. I've been using simple/mid-level drawing programs for a while, and its interface is in a different league entirely. It does things that I hadn't seen or thought of before, yet they are totally obvious to use and highly useful. If you're on a Mac you owe it to yourself to give it a spin.

Friday, June 6, 2008

debugging hibernate/mysql systems

Just a quick note on debugging Hibernate/MySQL systems.

In my ‘webtwo’ system I had the following constraint in the db:

constraint fk_submission_item_image foreign key (image_id) references image_data(id) ON DELETE cascade,

However, image_data didn’t exist.

During operation it caused the error:

18:34:56,009 WARN [JDBCExceptionReporter] SQL Error:
1452, SQLState: 23000
18:34:56,009 ERROR [JDBCExceptionReporter] Cannot add
or update a child row: a foreign key constraint
fails (`webtwo_dev/submission_item`, CONSTRAINT
`fk_submission_item_image` FOREIGN KEY (`image_id`)
18:34:56,009 ERROR [AbstractFlushingEventListener]
Could not synchronize database state with session
Could not execute JDBC batch update
at org.hibernate.exception.SQLStateConverter.convert
at org.hibernate.exception.JDBCExceptionHelper.convert

I thought this indicated a problem with my Hibernate/Java mapping, since I had been working with the database for a few months without any errors being thrown by MySQL during db creation. I also thought that I had successfully populated all the tables.

In retrospect this wasn’t the case: I had not populated the tables, and it is also not surprising that MySQL didn’t signal an error, since I ran “SET FOREIGN_KEY_CHECKS = 0;” during table creation. So, my bad.

The moral is that any unpopulated table may be fundamentally misspecified.

This experience reinforces my heuristic of populating all the tables via an initial data population script, at least for testing purposes. Realistically, this requires two load scripts: one to populate the data required for operation (i.e., roles, menu values, etc.), and a second to do a test load to verify database integrity.
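As an illustration of the kind of check a test load could run afterwards, here is a small sketch that looks for child rows whose foreign key points at no parent row. It assumes the relevant ids have already been read into memory; the class and method names are hypothetical, and a real check would read them from the database (for example, submission_item.image_id against the ids of the referenced table).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical integrity check for use after a test load.
class ForeignKeyCheck {

    // Returns the ids of child rows whose foreign key value matches no parent id,
    // in ascending order. Child rows with a null foreign key are skipped,
    // since a nullable foreign key column may legitimately be empty.
    static List<Long> findOrphans(Map<Long, Long> childIdToParentId, Set<Long> parentIds) {
        List<Long> orphans = new ArrayList<>();
        // Copy into a TreeMap so the result order does not depend on the input map type.
        for (Map.Entry<Long, Long> row : new TreeMap<>(childIdToParentId).entrySet()) {
            Long parentId = row.getValue();
            if (parentId != null && !parentIds.contains(parentId)) {
                orphans.add(row.getKey());
            }
        }
        return orphans;
    }
}
```

An empty result for every foreign key in the schema is a reasonable signal that the load exercised the constraints, even if the tables were created with foreign key checks switched off.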