There is a wide variety of tools for manipulating and examining documents marked up with SGML. In particular, they deal cleanly with the structure described by the DTD and the markup. This makes SGML well suited as a "common ground" format for converting structured data.
Given a database format that already has self-describing fields, it is even possible to construct a DTD directly that covers all of the existing structure. An example of this is the GDB Generic Database format used by the HP palmtop applications. The application on the palmtop actually has a format editor that lets one easily change field names and types, as well as where the fields appear on input screens.
The database format is published, and there are gdbload and gdbdump programs that convert databases to and from comma separated value format. I have modified gdbdump to accept a -S flag that generates SGML output, including the DOCTYPE declaration describing the data. The outermost element (covering the entire database) is named from the filename given to the conversion command. Within that, a dbentry element is composed of all of the field names stored in the database header, each of which is itself declared as a simple #PCDATA element.
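As a rough illustration of the DTD generation described above, here is a minimal Python sketch. It is not the modified gdbdump; the helper names (sgml_name, build_doctype), the sample field names, and the choice to require every field in header order are all assumptions made for the example. It also assumes the database has at least one field.

```python
import re

def sgml_name(raw):
    """Turn a database or field name into a conservative SGML name (hypothetical helper)."""
    name = re.sub(r"[^A-Za-z0-9.-]+", "-", raw.strip()).strip("-").lower()
    # SGML names must start with a letter, so prefix one if needed.
    if not name or not name[0].isalpha():
        name = "f" + name
    return name

def build_doctype(db_name, field_names):
    """Return a DOCTYPE declaration for one database.

    Assumed layout: root element named after the database, holding any
    number of dbentry elements, each holding every field in header order.
    """
    root = sgml_name(db_name)
    fields = [sgml_name(f) for f in field_names]
    lines = [f"<!DOCTYPE {root} ["]
    lines.append(f"<!ELEMENT {root} - - (dbentry*)>")
    lines.append(f"<!ELEMENT dbentry - - ({', '.join(fields)})>")
    for field in fields:
        lines.append(f"<!ELEMENT {field} - - (#PCDATA)>")
    lines.append("]>")
    return "\n".join(lines)

# Sample field names are invented for illustration.
print(build_doctype("phone", ["Name", "Home Phone", "Address"]))
```

A real converter would read the field names from the database header rather than from a literal list, and might make fields optional instead of required.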
There is one odd case -- a record can contain a Note field with multiple lines of text. In order to preserve the line breaks, an empty crnl element is permitted within any element corresponding to a field which had the note attribute in the database description.
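The note handling can be sketched the same way. The function names below are hypothetical, and the sketch assumes the note text contains no "<" or "&" characters, since escaping them would need entity declarations that are not covered here.

```python
def note_declarations(element):
    """Declarations for a note-style field that may contain line breaks.

    The field allows text mixed with empty crnl elements; crnl itself is
    declared EMPTY with an omissible end tag.
    """
    return (
        f"<!ELEMENT {element} - - (#PCDATA | crnl)*>\n"
        "<!ELEMENT crnl - O EMPTY>"
    )

def note_to_sgml(element, text):
    """Wrap a multi-line note, marking each line break with <crnl>.

    Assumes the text has no markup characters that need escaping.
    """
    lines = text.replace("\r\n", "\n").split("\n")
    return f"<{element}>" + "<crnl>".join(lines) + f"</{element}>"

print(note_declarations("note"))
print(note_to_sgml("note", "Call after 5pm\r\nAsk for the front desk"))
```

Going the other way, replacing each crnl element with a line break restores the original note text.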
The USR PalmPilot palmtop also has several databases, though not using such a generic format. In fact, the current tools for the Pilot query the palmtop directly for records, rather than manipulating the database images available on the server side of a HotSync or pilot-link backup. Since the primary motivation for this effort was to handle conversion from the HP to the Pilot, the next step is to write programs that will:
- parse an SGML file that is formatted with a DTD specific to the Pilot, and download it.
- convert the SGML in the HP database DTD to the Pilot-specific DTD.
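The conversion between the two DTDs could start out as a simple tag-renaming pass. The sketch below uses Python's html.parser as a stand-in for a real SGML parser, so it only copes with fully tagged input (no omitted tags, no DOCTYPE declaration, no attributes). The Pilot-side element names in FIELD_MAP are invented for the example, since no Pilot DTD has been defined yet.

```python
from html.parser import HTMLParser

# Hypothetical mapping from HP element names to Pilot element names.
FIELD_MAP = {
    "phone": "addressdb",
    "dbentry": "record",
    "name": "lastname",
    "home-phone": "phone1",
}

class TagRenamer(HTMLParser):
    """Copy the input through, renaming elements according to a mapping."""

    def __init__(self, mapping):
        # Keep entity and character references unexpanded so that they
        # can be passed through untouched.
        super().__init__(convert_charrefs=False)
        self.mapping = mapping
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped; the generated DTD declares none.
        self.out.append(f"<{self.mapping.get(tag, tag)}>")

    def handle_endtag(self, tag):
        self.out.append(f"</{self.mapping.get(tag, tag)}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

def convert(sgml_text, mapping):
    """Return sgml_text with element names rewritten via mapping."""
    parser = TagRenamer(mapping)
    parser.feed(sgml_text)
    parser.close()
    return "".join(parser.out)

hp = "<phone><dbentry><name>Ann</name><home-phone>555</home-phone></dbentry></phone>"
print(convert(hp, FIELD_MAP))
```

html.parser lowercases element names, which matches the default case-insensitive treatment of names in SGML. Fields that need restructuring rather than renaming (splitting a name into first and last, for instance) would need more than this.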
Back-conversion is also important. Having gone through the effort of producing the data in a marked-up form, it may be worth treating that form as the canonical one.