A journey in IBM systems

DB2 is IBM's database management system. Currently it has the second-largest market share of the “big three” RDBMSs: Oracle is first and SQL Server a distant third, although SQL Server is growing by leaps and bounds. DB2 is installed automatically with the System i operating system, and it gains many benefits from being integrated with the OS.
There is a little jargon that is particular to DB2 on this platform. First and most important is the “Physical File” (PF), which is a table in the database. One thing that I am not a fan of is that DDS, the traditional way of describing a physical file, has no way to declare referential integrity through things like foreign keys; such constraints have to be added separately, for example with the ADDPFCST command or through SQL. Of course there are easy ways to make unique keys, or it would not function as a database.
Second is the “Logical File” (LF), which is built on top of physical files and works a lot like a view in SQL Server. A logical file is a “sorted path”, or saved query, over the data in its physical files; it does not hold a separate copy of the rows. If you have a PF, or several PFs that get joined together a lot, you should build a logical file to speed up data retrieval. A logical file can also be used to update a physical file, as long as it accesses only one PF and is not a join of two or more PFs.
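To make the view comparison concrete, here is a small sketch of a saved join. It uses Python's built-in sqlite3 module as a stand-in, not DB2 on System i, and the table and view names (customer, orders, cust_orders) are made up for illustration. The two tables play the role of physical files and the view plays the role of a join logical file.

```python
import sqlite3

# In-memory SQLite database used purely as an analogy for PFs and LFs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two tables standing in for physical files (hypothetical names).
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        cust_id  INTEGER NOT NULL,
        total    REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Acme'), (2, 'Bolt');
    INSERT INTO orders VALUES (10, 2, 75.0), (11, 1, 20.0), (12, 1, 5.5);

    -- A view standing in for a join logical file: the join is defined once
    -- and reused, and no second copy of the rows is stored.
    CREATE VIEW cust_orders AS
        SELECT c.name, o.order_id, o.total
        FROM customer c JOIN orders o ON o.cust_id = c.cust_id;
""")

rows = conn.execute(
    "SELECT name, order_id, total FROM cust_orders ORDER BY name, order_id"
).fetchall()
for row in rows:
    print(row)
```

Like a join logical file, this view is only for reading. The analogy is loose, though: SQLite does not allow updates through any view, single-table or not, so it cannot illustrate the single-PF update rule.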

The most basic way a database stores data is with character sets. A character set is the way we turn character data into 1s and 0s so that computers can work with it. Two major character sets are in use today, and there is a third worth mentioning. They are ASCII and Unicode and, because this is a blog about IBM systems, we will also talk about EBCDIC.
EBCDIC is a character set that IBM developed for its mainframes. It is rarely used outside of IBM systems and is not well liked in the modern computer industry as a whole. One criticism of EBCDIC is that its codes are not in sequential order: the letters a to z are not all next to each other (there are gaps after i and after r), and therefore sorting and range checks are more awkward. Another problem is that EBCDIC stores each character in 8 bits, which leaves room for only 256 characters per code page and therefore makes it difficult to use for languages other than English.
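The gaps in the EBCDIC alphabet are easy to see from Python, whose standard codecs include cp037, one common EBCDIC code page. This is only a minimal sketch: the choice of cp037 and the helper name ebcdic_codes are my own assumptions for illustration, and other EBCDIC code pages differ in some positions.

```python
import string

def ebcdic_codes(text, codec="cp037"):
    """Return the EBCDIC byte value of each character in text."""
    return list(text.encode(codec))

letters = string.ascii_lowercase
codes = ebcdic_codes(letters)

# Collect every place where two neighbouring letters
# do not have neighbouring codes.
gaps = [
    (letters[i], letters[i + 1], codes[i + 1] - codes[i])
    for i in range(len(letters) - 1)
    if codes[i + 1] - codes[i] != 1
]

for left, right, step in gaps:
    print(f"{left} -> {right}: codes jump by {step}")
```

The alphabet comes out as three separate runs (a to i, j to r, s to z), so a simple "is this byte between a and z" test also matches codes that are not letters.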
The big hitter in the character set family is ASCII. Unlike EBCDIC, it stores most of the characters in a logical human order. It is probably the most commonly used character set today, but it shares a downfall with EBCDIC: ASCII itself is a 7-bit code, normally stored in 8-bit bytes, so it too can represent only a small set of characters. In an expanding global market, a new way of storing data needed to be invented.
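A quick way to check these points is to compare the two encodings side by side. The sketch below again assumes cp037 as the EBCDIC code page, and the sample values are arbitrary.

```python
import string

letters = string.ascii_lowercase
ascii_codes = list(letters.encode("ascii"))

# In ASCII, a to z occupy one unbroken run of codes.
contiguous = all(b - a == 1 for a, b in zip(ascii_codes, ascii_codes[1:]))

# Every ASCII code fits in 7 bits.
fits_in_7_bits = all(code < 128 for code in ascii_codes)

# The same values sort differently depending on the character set.
words = ["b", "B", "1"]
ascii_order = sorted(words, key=lambda w: w.encode("ascii"))
ebcdic_order = sorted(words, key=lambda w: w.encode("cp037"))

# Characters outside the ASCII range cannot be encoded at all.
try:
    "é".encode("ascii")
    accented_ok = True
except UnicodeEncodeError:
    accented_ok = False

print(contiguous, fits_in_7_bits, ascii_order, ebcdic_order, accented_ok)
```

ASCII sorts digits first, then uppercase, then lowercase, while EBCDIC sorts lowercase first, then uppercase, then digits. That is worth remembering when sorted data moves between an IBM system and a PC.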
Unicode will probably be the character set that shoves ASCII out of the limelight. Unicode was developed to cover all of the world's writing systems, including Asian languages, most of which are not alphabet based and need thousands of distinct characters. It can represent far more characters at the cost of more storage space: a character that costs you 8 bits in ASCII costs 16 bits in a 16-bit encoding such as UTF-16, while UTF-8 uses one to four bytes per character and keeps plain ASCII text at one byte each. As a best practice, you should use Unicode if memory is not an issue, or if you will be storing data in multiple languages, especially languages like Chinese.
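To see what the space cost actually looks like, the sketch below measures how many bytes a few sample characters take in three common Unicode encodings. The sample characters and the helper name encoded_sizes are my own choices; the big-endian variants are used so that no byte order mark is added to the count.

```python
samples = ["A", "é", "中", "😀"]

def encoded_sizes(ch):
    """Return the number of bytes ch needs in each Unicode encoding."""
    return {
        enc: len(ch.encode(enc))
        for enc in ("utf-8", "utf-16-be", "utf-32-be")
    }

sizes = {ch: encoded_sizes(ch) for ch in samples}

for ch, per_encoding in sizes.items():
    print(ch, per_encoding)
```

Plain English text doubles in size under UTF-16 but stays the same under UTF-8, while a Chinese character takes two bytes in UTF-16 and three in UTF-8. Which encoding is cheaper depends on the language of the data being stored.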

Saturday, January 29, 2011

DB2 overview

Sunday, December 12, 2010

Character Sets