Using historically standard analysis techniques for file placement on disks within the Unisys 2200 environment, it is possible to significantly improve performance and capacity without significant additional outlays for hardware. Standards for disk usage and file placement have been widely dismissed as no longer relevant, following the "understanding" that modern disks provide sufficient native speed that file placement no longer matters. This is not a valid assumption.
Unisys 2200 Mainframes
Within the Unisys 2200 environment in most shops, the mainframe hardware systems are usually configured to run as a set of multiple virtual operating systems. This multiple-VOS environment allows both the partitioning of hardware to serve the most critical needs of the business and the allocation of different shares of hardware capacity to the various VOS. For example, it is possible to allocate 50% of the capacity of one mainframe to run the online (TIP) environment in order to provide high throughput, 25% to run production batch processes, and 25% to support application development.
Unisys 2200 Disk Farms
In addition to partitioning the CPU and memory, the disk farms attached to these mainframes are also capable of being partitioned, or of being identified as shared between multiple VOS. Ideally, portions of the disk farm will be partitioned into production disks and development disks, to maintain a strict physical separation between the production environment and the development environment.
Also note that the physical disks can be partitioned into smaller configurations of two, four, or eight partitions. In the most recent high-capacity disks, the heads are designed to be "at rest" when positioned over the center of the physical storage area, or at the boundary between partitions, as shown in the following table.
Two-partition disks   | between partitions one and two
Four-partition disks  | between partitions two and three
Eight-partition disks | between partitions four and five
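The table can be restated as a small illustrative function (the function name is invented; it assumes even partition counts and partitions numbered from 1):

```python
def rest_boundary(partition_count):
    """Return the pair of adjacent partitions whose shared boundary is
    the head's 'at rest' position, per the table above."""
    if partition_count < 2 or partition_count % 2 != 0:
        raise ValueError("expects an even partition count of 2 or more")
    half = partition_count // 2
    return (half, half + 1)

# rest_boundary(2) -> (1, 2)
# rest_boundary(4) -> (2, 3)
# rest_boundary(8) -> (4, 5)
```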
Unisys 2200 File Types
Within a production Unisys 2200 disk environment, there are five general types of files which need to be supported:
1. Operating system files, including TIP absolute or ZOOM elements, generally allocated onto specific physical packs.
2. DMS-2200 files, pre-allocated with a specific size in support of the hierarchical database environment.
3. RDMS-2200 files, generally allocated with a specific size based on initial estimates of the data to be stored, and increased in size as required to support long-term growth of the data.
4. Flat files, which hold all non-database information related to the production data. These usually store transient data, or data from a specific date, with cycling supporting older historical data sets (up to 32 cycles).
5. Directory files, which contain elements. These files act as a second-level directory giving access to many small elements of different types, such as program source libraries.
Both DMS-2200 and RDMS-2200 data files can be structured as multiple partitions on one or more physical disks.
Disk Numbering and Naming
In most shops, the numbering of disk partitions is set up using what is essentially a randomizing factor, so that the identity of the physical disk and the partition number is completely masked from an operational viewpoint. This randomization hides exactly the information needed to analyze performance, and becomes an impediment to improving the performance of the applications.
A more accurate means of identifying the disks uses the following standard naming convention (or a variation) in order to more fully expose the performance data. The disk name contains six characters, in the pattern LP DD P L:
use the first two characters, "LP", for Logical Pack,
use the next two characters (as numbers) to identify the physical disk number,
use the next character (as a number) to identify the partition number,
use the last character (as a number) to identify the 'leg' of the pack
(in a mirroring environment, each leg may be significant).
The name LP0141 would identify the logical pack located on the first physical disk, in partition 4 (near the center of the disk), and leg 1 (in a two-leg mirror environment). The name LP3280 would identify the logical pack located on the 32nd physical disk, partition 8, with no mirroring.
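As an illustration, such names can be decoded programmatically; the function name and the returned field names here are invented for the sketch, not part of any Unisys tooling:

```python
import re

def parse_pack_name(name):
    """Decode a six-character logical pack name of the form LPddpl,
    following the naming convention described above."""
    m = re.fullmatch(r"LP(\d{2})(\d)(\d)", name)
    if m is None:
        raise ValueError("not a valid LPddpl pack name: %r" % name)
    disk, partition, leg = (int(g) for g in m.groups())
    return {"disk": disk, "partition": partition, "leg": leg}

# parse_pack_name("LP0141") -> {"disk": 1, "partition": 4, "leg": 1}
# parse_pack_name("LP3280") -> {"disk": 32, "partition": 8, "leg": 0}
```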
System Upgrade at Global Payment Services Company
During one of the most recent system upgrades at a leading payment services company, the target configuration was two mainframe hardware units divided into five VOS, one XPC for disk farm access, and a massive disk farm. Prior to the installation of this newest environment, there was some discussion of how to configure the disks in order to provide the highest throughput for the TIP applications. A portion of the senior engineering staff stated that, since the inherent performance of the new disks was so much better than that of the previous disks, disk naming and file placement were no longer a concern. To determine whether these claims were accurate, an experiment was devised and executed to measure the effect of head movement on data file access. The results and actions that follow confirmed that, even in the most modern environments, paying attention to the basics of configuration can make a significant difference in the performance and capacity of core systems.
Preliminary Disk Performance Testing
After the newest set of disks was installed, in a test and validation mode prior to production use and with eight-partition formatting, a series of tests of data access speeds was conducted in order to identify more closely the factors which could be significantly affecting database access.
In the trial, the goal was to measure the time to access files, without regard to the data within the files. The trial code accessed one file using the editor, then accessed a second file using the editor, and repeated this pair of accesses 100 times; 100 repetitions were needed to produce a timing large enough to be valid. The time to access two files both located on partition 1 was 53 seconds. The time to access the same two files when they were located on partitions 1 and 8 respectively was 64 seconds, that is, 11 seconds (21%) longer when the files were located at opposite ends of the physical disk. The time to access two data files can be described by the equation:
Speed(100 trials) = Base time (53 seconds) times (1 + 3% times partition distance)
Speed(distance 0) = 53 seconds times (1 + 3% times 0) = 53 seconds
Speed(distance 7) = 53 seconds times (1 + 3% times 7 = 21%) = 64 seconds
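The model can be written as a short function. The 53-second base and the 3%-per-partition factor come from the trial described above; the function itself is only an illustrative sketch of the model, not part of any Unisys tooling:

```python
def access_time(partition_distance, base_seconds=53.0, per_partition=0.03):
    """Predicted time for the 100-access trial: the base time plus 3%
    of the base for each partition boundary the head must cross."""
    return base_seconds * (1.0 + per_partition * partition_distance)

# Same partition: access_time(0) gives the 53-second base.
# Partitions 1 and 8 (distance 7): access_time(7) gives about 64 seconds.
```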
Access Improvement
Based on the information above, it was determined that when disks are configured using eight partitions, the center two partitions (4 and 5) must be reserved for database files (and operating system files on non-database disks) in order to provide the fastest access speeds to these specific files. This "standard" will result in the least amount of disk head movement when accessing multiple database files in a standard TIP processing environment.
I-O Trace
Improvements in accessing the database files (DMS-2200 and RDMS-2200 types) can directly and significantly affect both the performance and the capacity of these systems. Within the Unisys 2200 environment, each database file is assigned a unique and specific TIP-file number. Each disk partition is also assigned a unique number.
During the execution of an operator-controlled "I-O TRACE", generally over a short interval (15 minutes), sufficient information is collected to identify all of the (1) transactions, (2) database files, and (3) disks being accessed. This data set can easily be reduced into a significant pool of information on the actual performance of the application environment.
Data Reduction for Access Improvement
Using the data from an I-O Trace, a summary of the total effort required to access each physical disk can be generated, preferably as a chart within an Excel spreadsheet. This chart will show which disks are experiencing significantly higher access rates than the others.
Further analysis of the same data, selected down to the partition level, provides a similar chart showing the high access points by partition; this can be reduced further, for specific high-use partitions, to identify the specific high-use database files.
Moving the very high access data files onto the lowest access disks will allow the whole disk farm to become much more balanced. In some cases, it may be required to place single very high access database files onto disks by themselves in order to keep the disk heads positioned directly over the files. Note that the creation of partitioned files can assist in decreasing the access times for large files by placing portions of these high-access files on different physical disks.
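The reduction steps above can be sketched in a few lines. The record layout used here, a (transaction, TIP-file number, disk, partition) tuple, is a hypothetical stand-in for the site-specific trace format:

```python
from collections import Counter

def summarize_trace(records):
    """Aggregate I-O trace records into access counts per disk and per
    (disk, partition) pair, the two summaries charted above."""
    per_disk = Counter(r[2] for r in records)
    per_partition = Counter((r[2], r[3]) for r in records)
    return per_disk, per_partition

# Hypothetical sample: three accesses on disk 1, one on disk 32.
sample = [
    ("TXN1", 101, 1, 4),
    ("TXN1", 102, 1, 5),
    ("TXN2", 101, 1, 4),
    ("TXN3", 205, 32, 8),
]
per_disk, per_partition = summarize_trace(sample)
# per_disk.most_common(1) identifies disk 1 as the hot disk (3 accesses);
# per_partition shows those accesses concentrated on partitions 4 and 5.
```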
System Throughput Improvements
Since the amount of time a TIP transaction spends waiting for data from files is much larger than the amount of time it spends executing in main memory against that data, changes in data access can have a large effect on the total capacity of the systems.
The specific configuration at the payment services company is defined to have four VOS partitions, each running at a capacity of 50 transaction-seconds (fifty 1-second transactions every second), for a total defined throughput of 200 transaction-seconds. If the average transaction time is 1.5 seconds, this represents 133 parallel transactions; an average of 1.1 seconds represents 182 transactions; an average of 0.8 seconds represents 250 transactions.
Changing the disk access time so that the average transaction goes from 1.1 seconds to 1.5 seconds reduces system capacity by 27%. Changing the transaction time from 1.1 seconds to 0.8 seconds increases capacity by 37%. If your environment has an average transaction time of 1.5 seconds and you can bring it to 0.8 seconds by modifying file placements, you will have increased total system capacity by 88% (from 133 to 250 parallel transactions) without adding any hardware.
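The capacity arithmetic can be checked with a short sketch:

```python
def parallel_transactions(capacity_txn_seconds, avg_txn_seconds):
    """Transactions in flight for a given total capacity
    (transaction-seconds per second) and average transaction time."""
    return capacity_txn_seconds / avg_txn_seconds

capacity = 4 * 50  # four VOS partitions at 50 transaction-seconds each

slow = parallel_transactions(capacity, 1.5)     # about 133
typical = parallel_transactions(capacity, 1.1)  # about 182
fast = parallel_transactions(capacity, 0.8)     # exactly 250

loss = (typical - slow) / typical  # ~27% lost going from 1.1 s to 1.5 s
gain = (fast - typical) / typical  # ~37% gained going from 1.1 s to 0.8 s
best = (fast - slow) / slow        # ~88% gained going from 1.5 s to 0.8 s
```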
Fixed-Gate Subsystem Memory Resident Data
While the operating system does not have the ability to generate a data file in main memory, access to data stored within a subsystem can be provided directly by a "call" from any of the standard languages. This memory-resident data should be a copy of the "master" data stored in a disk file.
The structure of the calling environment must be able to determine the validity of memory-stored data, so the first value defined in the subsystem is a control flag indicating when the data has been completely loaded from disk. If the flag indicates the data is not loaded, the calling program must retrieve the data from disk. If the flag indicates the data is loaded, the calling program can retrieve the data from memory. Note that since the data is generally being accessed by multiple programs simultaneously, the return status of the memory call should be tested for validity; a failure should result in either a retry or a reference to the disk data.
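A minimal sketch of that calling discipline, with dictionaries standing in for the memory-resident subsystem and the disk file; the names (DATA_LOADED, fetch_record) and the structure are invented for illustration, not the real fixed-gate subsystem interface:

```python
DATA_LOADED = 1  # control flag value: load from disk is complete

def fetch_record(key, subsystem, disk_file):
    """Return a record, preferring the memory-resident copy when the
    subsystem's control flag says the load from disk is complete."""
    if subsystem.get("flag") != DATA_LOADED:
        return disk_file[key]            # not yet loaded: go to disk
    record = subsystem["data"].get(key)  # the memory call
    if record is None:                   # call failed the validity test
        return disk_file[key]            # fall back to the master copy
    return record
```

A real caller would loop on the retry case rather than falling straight back to disk; the single fallback here keeps the sketch short.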
Improvements Using Memory-Resident Data
In reviewing access to database files, one or more files will be identified as read-only data files. The Unisys 2200 operating system has the ability to store data in main memory using "Fixed-Gate Subsystems." By moving these read-only data elements into one of these subsystems, the time required to access them can be reduced to less than measurable values, providing substantial improvements in total system speed.