Log Buffer #78: A Carnival of the Vanities for DBAs

04 January 2008 » DB2, IBM, MySQL, PHP, System administration, XML

Happy new year everyone! This week I’m honored to host the 78th edition of Log Buffer, the weekly roundup of database blogs.

A special thanks goes to Dave Edwards of the Pythian Group for the opportunity to start the year right by catching up on the latest developments around the database world. I’ve been blissfully out of the loop planning a wedding, relaxing on the honeymoon, and spending time with family. :)

About this week’s news
Many folks were also off celebrating the holidays (or recovering from New Year’s celebrations), so it’s been a quiet week.

Without an earth-shattering announcement to stir up controversy, there’s been a trend towards end-of-year summaries, predictions for the new year, and time to jot down tips or otherwise reflect on projects that scratch the author’s itch.

I’m an IBM Web application developer – not a database administrator per se – so this week’s edition will offer my biased take on the news. I hope you enjoy it anyway. :)

DB2 and Informix
First up, Chris Eaton encourages us to have a look at (and get involved with) the new PHP-based DB2 Monitoring Console project at SourceForge.

The DB2MC aims to be the long-awaited Web-based console for managing DB2 instances and databases, merging the role of the standalone Control Center shipped with DB2 and the simplified approach to database administration taken by the popular phpMyAdmin project favored by many MySQL shops.

Over at DB2 Magazine, Scott Hayes of DBI asserts that “performing excessive and unnecessary sorts is the number two performance killer in most databases” on the Linux, Unix, and Windows platforms. Fortunately, he offers a few tips for neutralizing this elusive killer.

On the mainframe, Robert Catterall provides some tips for maximizing performance when accessing data by tweaking the size of blocks fetched over the network from DB2 z/OS.

Further good news for DB2 customers is that the always popular “Recommended reading lists” for database administration and application development at IBM developerWorks have been updated for v9.

On the Informix platform, the latest issue of the International Informix Users Group (IIUG) Insider has been published. It announces both open registration for the IIUG Informix conference and board elections (man, those middle American states have a lot of electoral clout), and reflects on a year marked by the mid-year release of IDS 11.

MySQL
Over at The Open Road, CNET blogger Matt Asay reveals MySQL CEO Mårten Mickos’ reflections on 2007. The widespread adoption of several editions of MySQL 5 was a highlight this year, along with advancements in scale-out features such as replication, partitioning, load-balancing, and caching.

Mickos notes that MySQL continues to build on its strength as a Web database and expand into corporations to complement instead of compete with existing proprietary platforms such as Oracle.

In other integration news, there has been some traction on the planned DB2 storage engine and MySQL port to i5/OS. An IBM Redbook will be published by the end of the month.

Moving down to the bare metal, Mark Robson has decided to put down an explanation for the many users who ask him about the pitfalls of running out of address space (not memory itself) on 32-bit MySQL installations.

Short answer: Spring for a 64-bit machine and stock plenty of RAM, regardless of the underlying operating system. :)

PostgreSQL
Andrew Dunstan offers up source for a conditional update trigger that intercepts modifications if their values don’t differ from what’s already in the database. This filter can save the expense incurred by unnecessary index updates.

Leo Hsu and Regina Obe clarify PostgreSQL’s support for stored procedures (or lack thereof) for a user over at the Postgres OnLine Journal.

They retort: “So the question is, is there any reason for PostgreSQL to support bona fide stored procedures aside from the obvious To be more compatible with other databases and not have to answer the philosophical question, But you really don’t support stored procedures?” Touché, grasshopper.

Robby Russell points us to the call for papers at PGCon 2008, and is himself interested in seeing a presentation relevant to Ruby on Rails Web app developers.

Oracle
Howard Rogers provides a hefty PDF of the courseware he once used to teach a five-day bootcamp – complete with exercises, slides and explanatory notes on “everything there is to know about Oracle.” But doesn’t that mean the oracle also knows everything about Howard? Think about it.

The seventy megabyte download targets 9i and has been partially updated for 10g, but the underlying themes should still be relevant for 11g.

Richard Foote details another subtle gotcha in his series on the difference between unique and non-unique indexes.

A befuddled Steven Karam details his root cause analysis of a problem upgrading Oracle 10 across x86 platforms. He found the solution despite a none-too-helpful error message. He concludes with a suggestion to Oracle for a better way to aid those who run into a similar problem…

Matt Topper announced a new way to keep up with Oracle news, a link-sharing site called Ora-Click.com. For those groaning “not another social network for geeks,” this is a subject-specific site and looks quite slick. I can see this model being emulated in other technology or product knowledge domains.

Eddie Awad is already on board with the Ora-Click idea and has offered a few suggestions for making it even more useful.

SQL Server
There have been quite a few posts about learning the new features of SQL Server 2008 ahead of its hotly anticipated February release.

SSQA.net provides us with a pointer to virtual training courses that Microsoft is offering through the end of January ahead of the 2008 general release. This ten part Web seminar series covers topics ranging from high availability to manageability, security, business intelligence, and reporting.

Bob Beauchemin has a trio of tips for using the new features of SQL Server 2008. There are some tips on plan guidance, as well as a pointer on using row constructors.

Thrudb
With all the buzz surrounding SimpleDB in December, Ilya Grigorik, CTO of Igvita, details Jake Luciani’s “faster, cheaper alternative” to Amazon’s offering. So far the reviews are positive. If you’re into document-based databases or S3 storage, this is worth a look.

CouchDB
Anant Jhingran and Sam Ruby have announced that Damien Katz of CouchDB will join IBM over in Information Management. In addition, CouchDB will be donated to the Apache Software Foundation as a top level project.

ObjectStore
Dan Weinreb, co-founder of Object Design, which developed ObjectStore, carries on the backlash against Michael Stonebraker with a detailed account of how object-oriented database technology did indeed succeed from both a business and technical perspective.

In a follow-on post the same day, Weinreb delves into more detail about the lessons learned when creating ObjectStore.

In the words of the great General Kenobi, “Luke, you will find that many of the truths we cling to depend greatly on our own point of view.”

 

And that wraps it up for this week’s Log Buffer. I hope you have a good time reading, but make sure you don’t spend all weekend in front of the computer: there’s plenty of good old analog wild card action to follow. Go Giants!

Have a great 2008!

Instant XML feeds via the JSTL SQL tags

20 December 2007 » DB2, Java, MySQL, Web architecture, WebSphere, XML

A dusty old Java tag library can help conjure up siloed Web site data for new uses.

Some background
I’ve developed a number of server-side Java Web applications over the years, first with scriptlets embedded in JSP, then with the template and tag driven paradigm offered by ATG Dynamo before the J2EE standards, and most recently with the Model-View-Controller architecture pattern in Struts and Spring MVC.

Each of those technologies (mostly) improved on its predecessor and enforced a better separation of concerns between the database, application logic, and presentation of the end result in the browser. This in turn has helped my teams divide and conquer Web application development among specialized job roles.

That’s why I’ve long been puzzled that the SQL tags in the JavaServer Pages Standard Tag Library exist as a standard part of J2EE 1.3 onward. These tags enable a front-end developer to embed SQL directly into a JSP page without the need for scriptlet code.

This tag library seemed an ill-conceived reversion (anti-pattern even) to the days before MVC took hold as a best practice in the Java world, and I’m pretty sure I skipped that section of the objectives when studying for the SCWCD exam.

That said, the SQL tags came in pretty handy this week for a particular challenge, and the more I think about how I can use them beyond their intended purpose, the more every new requirement I see looks like a nail.

My particular application context
I support a content management application which was designed, developed, and deployed circa Web 1.9. It’s stable, performant, and most importantly, met its functional requirements of the day.

In the two years that it’s been deployed, several new requirements have arisen that have expanded its anticipated scope as a traditional Web application.

In particular, the ubiquity of XML feeds has driven the need for it to present its core data outside of the templates existing in the confines of its own Web site. The rise of tagging and the popularity of multimedia as syndicatable content have also made it creak.

Compounding the architectural limitations of the application itself is its inflexible hosting environment. The data center that this site is deployed to is governed by CYA-driven restrictions (rightly so) which constitute a barrier to frequent application deployment cycles that add new functionality.

This environment makes it difficult to adopt nascent technological advances – the next big thing in “coolness” or usability – but has also kept the application exceptionally stable and available to meet its codified requirements without introducing undue legal or financial risk.

The application itself consists of two subcomponents. There is a Web application module on the secured intranet for authors to generate new content, and a publicly accessible read-only Web application module to display published content.

It’s primarily this latter Internet application where the use of JSTL SQL tags comes in most handy, but I can imagine uses on the intranet side as well (ad hoc reports, for example).

The case for SQL tag driven XML feeds
The JSTL standard defines a tag library for issuing queries against a data source defined in the Web deployment descriptor without using JDBC in Java scriptlet code in a JSP.

If this sounds like a simple concept that harks back to the Model 1 JSP days, that’s because it is. The documentation itself shows some apprehension about inappropriate use of these tags:

The JSTL SQL tags for accessing databases … are designed for quick prototyping and simple applications. For production applications, database operations are normally encapsulated in JavaBeans components.

But therein lies their simplicity, flexibility and power for this particular production application scenario.

To expose any slice of your data that a SQL query can reach (as the authorized user mapped to the JNDI entry for that data source), all you need to do is write your query and iterate through the result set in an XML template defined in your JSP.

Think about that outside of this technology’s intended use as a prototype or simple application building block. Instead, imagine how you could use these tags to improve the value of a complex existing production application.

For example, suppose you’ve always provided an RSS feed for your latest ten published news stories. You’ve written your Controller or Action in your MVC framework of choice and deployed it.

But now your users are demanding the latest five thumbnails of images published with a story to accompany its syndicated title and abstract in their latest mashup. Or perhaps they only want to see the last 10 stories which contain a given keyword.

What do you do? You could write a new Action or Controller and proper Command class in Java to meet that requirement. That would require updating some configuration files or deploying an EAR or WAR.

But look, you have an existing deployed stable application. Why risk introducing new code or downtime to a perfectly good application? Why not just free your data for use by your users’ new requirements in a quick hitting, low risk way?

Reuse your data by plugging in new JSTL SQL tag driven JSP files; don’t rebuild your application for every new data usage requirement.

To the tag library!
Ok, so you’ve read this far. I promise, the implementation itself will be much shorter :)

So your users want more information delivered via your feeds, or they wish to query by keyword or otherwise filter your data in a way you never anticipated.

Let’s see if we can free up that data for them.

  1. Write your query, with or without input parameters.

    SELECT ID, TITLE, ABSTRACT FROM NEWS_ARTICLES;

    -- Bind the parameter as '%keyword%'; a ? inside a quoted literal is not a placeholder.
    SELECT ID, TITLE, ABSTRACT
      FROM NEWS_ARTICLES
      WHERE BODY LIKE ?
      FETCH FIRST 5 ROWS ONLY;

    SELECT ID, TITLE, ABSTRACT, THUMBNAIL
      FROM NEWS_ARTICLES NA, NEWS_ARTICLE_IMAGES NAI
      WHERE NA.ID = NAI.NA_ID;

  2. Determine what XML format it should be in, whether a standard such as Atom or something custom like the following.

    <?xml version="1.0" encoding="UTF-8" ?>
    <results>
      <result id="">
        <title></title>
        <abstract></abstract>
        <thumbnail></thumbnail>
        <body></body>
      </result>
    </results>

  3. Tie the query to the format in a JSP file using the JSTL SQL tag library (and optionally, the Core tag library to escape output) and the JNDI name of the data source you already have configured in web.xml.

    Consult the documentation if you want to use placeholders.

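    <%@ page contentType="text/xml; charset=UTF-8" pageEncoding="UTF-8" session="false"
    %><%@ taglib prefix="sql" uri="http://java.sun.com/jsp/jstl/sql"
    %><%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core"
    %><?xml version="1.0" encoding="UTF-8" ?>
    <%-- The directives above are chained so that no whitespace is emitted
         before the XML declaration, which must come first in the output. --%>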

    <sql:setDataSource dataSource="jdbc/yourdatasource"/>
    <sql:query var="items">
      SELECT ID id, TITLE title, ABSTRACT abstract, BODY body, THUMBNAIL thumbnail
        FROM NEWS_ARTICLES NA, NEWS_ARTICLE_IMAGES NAI
        WHERE NA.ID = NAI.NA_ID
    </sql:query>

    <results>
     <c:forEach var="row" items="${items.rows}" >
      <result id="<c:out value="${row.id}"/>">
        <title><c:out value="${row.title}"/></title>
        <abstract><c:out value="${row.abstract}"/></abstract>
        <thumbnail><c:out value="${row.thumbnail}"/></thumbnail>
        <body><c:out value="${row.body}"/></body>
      </result>
     </c:forEach>
    </results>

  4. Deploy the JSP file as your application server requires. If JSP reloading is not enabled, restart the application (consider enabling reloading with a 15-minute interval or similar, so you keep most of the performance boost but still have a hook for updating JSPs individually).
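
For the keyword filter from step 1, a placeholder version of the JSP in step 3 might look something like the following. Treat it as a minimal sketch rather than a tested drop-in. It assumes the keyword arrives as a request parameter named keyword (a name made up for this example), and it reuses the placeholder data source name jdbc/yourdatasource. It relies on the standard <sql:param> tag, and on the maxRows attribute of <sql:query> in place of FETCH FIRST. The wildcards travel in the bound value, because a ? inside a quoted literal would be read as text rather than as a parameter marker.

```jsp
<%@ page contentType="text/xml; charset=UTF-8" pageEncoding="UTF-8" session="false"
%><%@ taglib prefix="sql" uri="http://java.sun.com/jsp/jstl/sql"
%><%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core"
%><?xml version="1.0" encoding="UTF-8" ?>
<%-- Assumption: the feed is requested as feed.jsp?keyword=... (hypothetical file and parameter names). --%>
<sql:setDataSource dataSource="jdbc/yourdatasource"/>

<%-- maxRows caps the result set at five rows. --%>
<sql:query var="items" maxRows="5">
  SELECT ID, TITLE, ABSTRACT
    FROM NEWS_ARTICLES
    WHERE BODY LIKE ?
  <%-- The value is bound to the ? marker; the % wildcards are part of the value. --%>
  <sql:param value="%${param.keyword}%"/>
</sql:query>

<results>
<c:forEach var="row" items="${items.rows}">
  <result id="<c:out value="${row.id}"/>">
    <title><c:out value="${row.title}"/></title>
    <%-- Bracket syntax, since some newer EL parsers reject Java keywords such as
         abstract in dot notation. --%>
    <abstract><c:out value="${row['abstract']}"/></abstract>
  </result>
</c:forEach>
</results>
```

Because the keyword is bound as a parameter instead of being concatenated into the SQL text, whatever the user types can’t change the shape of the query. Note that % and _ characters in the keyword will still behave as LIKE wildcards, and an empty keyword matches every row with a non-null BODY.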

Conclusion
That’s it. You’ve now added an aspect of functionality to your application which frees up any data you can query via SQL (or XQuery, if your data server is so enabled). You’ve done it in a pluggable fashion and haven’t needed to build any new Java code within your existing application and its framework.

Of course, the flip side is that you’ve done it outside of your application framework and may have circumvented some well-intended best practices. However, you may prefer to think of this approach as a temporary, low-risk way to share the data available to the users of your application in novel ways that may justify investing in the development of longer term solutions.

Ironically enough, this reversion to single file deployment can make an application buzzword compliant with one of the most touted recent enterprise-targeted architectural patterns: SOA. It reduces the barrier between the value an application has (its data) and the consuming end point of that data to a simple JSP.

Thoughts on the Spring Framework

One of my goals for the year was to learn more about the Spring Framework, a layered set of modules intended to make enterprise Java development easier.

In September, I took a class on Spring. Last month, I put together a presentation to share what I learned with my colleagues.

The bootcamp
I attended the Core Spring bootcamp in New York City. If you’re looking to learn about Spring in a classroom environment, there is perhaps no better source than Interface21’s consultants, since Interface21 is the company behind the project.

The course materials were solid and the class revolved around a real-world application scenario. For each segment presented, corresponding before and after projects for the lab were available in a workspace used by the free, Eclipse-based Spring IDE that was distributed to the class.

A copy of Pro Spring was included with tuition, but I’ve found the newer Spring in Action, second edition a better read. The instructors also noted that the older Pro Spring book’s coverage of AOP and transactions wasn’t up to date with the latest 2.x versions of Spring.

As you would expect from a technology that in practice killed EJBs, the Spring bootcamp focuses on middle-tier applications. However, there was good coverage of Spring MVC, which I hope to adopt for a new project built on WebSphere 6.1 and J2SE 5.0.

Like Baldwin home from Paris
Some time after the course I had the opportunity to present an overview of the Spring Framework to my department. The audience consisted of developers as well as project managers and other non-technical types, so the goal was to keep the presentation accessible as well as sufficiently informative.

To sum up the philosophy of Spring in a nutshell, I hit on three characteristics with real world parallels. I hope they went over well, and I think some of the other learning materials out there would benefit from a similar approach. It seems the ideas behind Spring could be conveyed more easily than they often are (another reason I like Spring in Action).

  • Inversion of Control / Dependency Injection
    Give code what it needs, don’t make it ask
    Does your car buy its own gas? Should it have its own credit card? Or use yours?
  • Interchangeability through Interfaces
    Keep things “black box” enough that you can swap individual pieces out when you need to
    Your twenty year old lamp can take either a traditional incandescent light bulb or a new compact fluorescent light bulb which provides a 75% more efficient way to do the same thing in the same standard socket.
  • Aspect Oriented Programming
    Terrible name and difficult vocabulary for an easy-to-grasp concept
    Consolidate logging and access control code, don’t repeat it all over the place
    Imagine an assistant tracking your billable hours to several projects during the day for you while you focus on your real job.

Silly middle-tier guys, or, what’s with that goofy .htm extension?
I have one outstanding gripe about Spring. Spring MVC in particular. I let it slide in the class, but since the foolishness is repeated in other materials for learning Spring I felt I had to speak my mind.

In a lot of examples for configuration of the Spring MVC DispatcherServlet, folks will tell you that “.htm” is the preferred convention for mapping requests to the front controller.

The argument goes that we should trick the search engines into thinking this is static HTML content and besides, the mapping reflects that even though the page is dynamic, it is indeed rendered in HTML. Furthermore, the “.do” convention from Struts is pure tomfoolery.

This is dogsqueeze.

I take exception to all those arguments. In fact, I’d go as far as to say the “.htm” extension for dynamic content is actually harmful:

  • Most developers deploy Java Web applications to an application server which is not also the front-end Web server. Application servers, such as WebSphere, are configured to serve only dynamic content, such as servlets and JSPs. The HTTP server, such as Apache, serves the static files such as HTML documents, stylesheets and images.

    What if you really did have a static file with a “.htm” extension deployed by one of your Web developers? Do you want to see her go bonkers when she can’t figure out why her file is not being found because you decided to map that extension to your servlet? WTF?

  • Do you really want to give the impression that you’re hosting your Fortune 500 or high volume e-commerce site on MS-DOS?
  • The “search-engine friendly” argument for the “.htm” extension goes out the window when you start using query string parameters anyway.
  • You’ve introduced goofiness at the cost of elegance, another tenet core to Spring. Sigh.

But honestly, if you’re going to go through with it anyway, just add the damn extra “l”; filename limitations (virtual as they may be) are a thing of the past.

Native XML Databases at NYPHP next week

17 October 2007 » DB2, Java, MySQL, PHP, XML

Elliotte Rusty Harold will offer his take on Native XML Databases at New York PHP next Tuesday night in Manhattan.

The presentation follows a mailing list thread and resulting blog post that generated a lot of interest and discussion on the topic. It should be a great talk for database administrators, application developers and content producers alike:

While much data and many applications fit very neatly into tables, even more data doesn’t. Books, encyclopedias, web pages, legal briefs, poetry, and more is not practically normalizable. SQL will continue to rule supreme for accounting, human resources, taxes, inventory management, banking, and other traditional systems where it’s done well for the last twenty years.

However, many other applications in fields like publishing have not even had a database backend. It’s not that they didn’t need one. It’s just that the databases of the day couldn’t handle their needs, so content was simply stored in Word files in a file system. These applications are going to be revolutionized by XQuery and XML.

If you’re working in publishing, including web publishing, you owe it to yourself to take a serious look at the available XML databases. This high-level talk explains what XML databases are good for and when you might choose one over a more traditional solution. You’ll learn about the different options in both open and closed source XML databases including pure XML, hybrid relational-XML, and other models.

As always, the meeting at IBM is free and open to the public, but you must submit your RSVP by 6PM EDT Monday, October 22nd.