Monday 31 March 2008

Watch out for asteroids, IBM!

The new z10 mainframe has been receiving a lot of praise in the media, and quite rightly so. In fact I blogged about it a little while ago, giving a rough guide to some of the announcement’s highlights. But the fact remains that no matter how powerful the mainframe is, no matter how reliable it is, and no matter how "green" it claims to be, ordinary people – and those turn out to be directors and managers at many companies – feel that the right choice for them is NOT a mainframe. If mainframes were dinosaurs (as they were so often labelled back in the 1990s), then these Unix and Windows-based boxes are the birds and mammals, and along with some volcanic activity (I’m not sure how far I can stretch this analogy!), that small dot up in the sky could very well be an asteroid plummeting towards the Earth.

The dinosaur comparison has been around for many years, and Xephon originally produced its book, The Dinosaur Myth, over 10 years ago. The latest update, sadly now itself four years old, can be found at www.arcati.com/dinomyth.htm. Although the figures quoted are out of date, the principle remains true that mainframes are cheaper to run than Unix systems or Windows-based systems. So, the right choice must be a mainframe.

IBM claims that the new System z10 is the equivalent of nearly 1,500 x86 distributed servers. They also say it has an 85 percent smaller footprint and 85 percent lower energy costs. So anyone who does the maths (assuming that IBM’s figures are accurate and not the result of PR company spin and exaggeration) would have to conclude that the only sensible choice is a mainframe.

Almost all the Fortune 500 companies (well alright, a very high percentage), particularly the banks and insurance companies and many airlines, rely on mainframes to keep their businesses going successfully. They know that the mainframe has amazing uptime figures – certainly compared to "other" servers. They know that problems with back-ups and restores, change management, etc, etc, were all addressed in the 80s and lessons learned the hard way (then) are now applied as a matter of course. So if the biggest companies in the world think mainframes are the right answer, doesn’t that make mainframes the hardware of choice for everyone else?

So how come, in a recent search through some Web sites offering IT jobs in my area, I found that almost all of them were for companies without mainframes? There were Unix jobs of all sorts, and Microsoft Server experience seemed to be highly desired. The programming jobs seemed to be mainly C (all flavours), with more asking for .Net than Java.

The Mesozoic Era – the period when dinosaurs ruled the Earth – began 251 million years ago and lasted about 186 million years. Just looking at the numbers of people working with computers, and what computers they are using, it does look like the era is coming to an end. 186 million years is a long time to be king; I’m just worried that IBM isn’t adopting the right strategy to avoid the extinction-level event associated with this metaphorical asteroid. What do you think?

Tuesday 25 March 2008

Data centre automation

Data centre automation sounds like it ought to be simple enough to discuss, but first we have to decide what we mean by a data centre. When I was a lad(!) it was obviously a mainframe sitting there with lots of DASD around it and some sort of front end for the communications. Nowadays, we could be talking about a mainframe, a mixture of mid-range boxes, and an always-on network allowing users to work on the data from their browsers. Although that makes things a bit more complicated, luckily, mainframe automation really got going as long ago as the late 1980s.

Back in 1992 I had a book published called "ASO – Automated Systems Operations for MVS". And by then most organizations knew they needed to automate and were well on the way to doing so. Sites were using MPF (Message Processing Facility) to suppress messages, and automated tape libraries were used to eliminate tape mounts – or the files were stored on disk. Suites of jobs no longer needed an operator to reply "yes" to confirm that some hash totals were correct before the next job in the sequence could start. In fact, the aim was to have the computer suite run by a man and his dog rather than the large number of operators previously employed. And the only answerable message that would come up on the screen would say, "Now feed the dog"!
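To make that concrete, here’s a little Python sketch of the sort of logic involved. It is NOT MPF or parmlib syntax – the message IDs and the canned reply are invented – it just illustrates the kind of suppression and auto-reply rules an automation product applies.

# Illustrative only: hypothetical message IDs and replies.
SUPPRESS = {"IEF403I", "IEF404I"}        # messages the operator never needs to see
AUTO_REPLY = {"HASH01": "YES"}           # hypothetical WTOR id -> automatic reply

def handle(message_id, text):
    """Decide what happens to a console message."""
    if message_id in SUPPRESS:
        return None                               # suppressed
    if message_id in AUTO_REPLY:
        return ("REPLY", AUTO_REPLY[message_id])  # answered without an operator
    return ("SHOW", text)                         # everything else reaches the console

print(handle("IEF403I", "JOB STARTED"))           # None - suppressed
print(handle("HASH01", "HASH TOTALS OK?"))        # ('REPLY', 'YES')
print(handle("DOG001", "Now feed the dog"))       # ('SHOW', 'Now feed the dog')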

As well as the mainframe, it was necessary to automate the network – starting, stopping, recovering, and monitoring. On the mainframe, back-ups and restores could be automated, and files could be migrated automatically. It even became possible to monitor the automation from a PC at home.

And then everything changed. It became necessary to look after files that were on Unix boxes or PCs. The world of distributed computing then became the responsibility of the IT department. And all the things that had been learned on the mainframe had to be relearned by users of these distributed systems. Everyone knew back-ups were a good thing, but no-one bothered until after they’d actually lost all their work, etc, etc.

And now, where are we with automation? Well, we have autonomic computing – IBM’s term for self-managing, self-healing software. The software will manage the amount of a resource that is available depending on the current and the anticipated workload. It will manage the system based on policies that have been given to it. Asset management and capacity management are clearly understood concepts and are built into the software. Even change management is now automated, although with compliance issues and regulatory changes it can be important to ensure the policies the software is working to are up-to-date and legal.
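As a rough illustration of what managing a resource against a policy means, here’s a small Python sketch. The policy values and utilization figures are invented; real autonomic software works from much richer policies and live measurements.

# Illustrative only: made-up policy and workload numbers.
policy = {"target_util": 0.70, "max_engines": 12, "min_engines": 4}

def engines_needed(current_engines, observed_util, forecast_util, p=policy):
    """Pick a capacity level from the current and the anticipated workload."""
    # Size for whichever is busier: what we see now or what we expect next.
    demand = max(observed_util, forecast_util) * current_engines
    wanted = int(demand / p["target_util"]) + 1
    return max(p["min_engines"], min(p["max_engines"], wanted))

# 8 engines running at 90%, with 95% forecast: the policy says add capacity.
print(engines_needed(8, 0.90, 0.95))   # -> 11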

Recently announced was IBM’s new Tivoli Service Management Center for System z – part of the Service Management portfolio – which automates the management of mainframes and distributed environments. Using policies, it provides incident, problem, change, release, discovery, and business service management. The use of policies means that IT performance is linked to business goals, with the added benefits of lowering costs, and ensuring the system is secure and compliant with regulations.

The software includes IBM Tivoli Change and Configuration Management Database, IBM Tivoli Application Dependency Discovery Manager, IBM Tivoli Business Service Manager, IBM Tivoli Service Request Manager and IBM Tivoli Enterprise Portal. This software is combined with process automation best practices, based on the Information Technology Infrastructure Library (ITIL).

Users get a single view of the critical applications hosted on their mainframes: the software discovers, controls, and shows the linkages between IT assets and business applications, enabling users to monitor the overall service delivery provided by IT in support of business objectives. The views can be customized for application mapping, relationship discovery, business services, service requests, finance, security, IT production, support, and operational control.

Last summer, HP bought Opsware to beef up its data centre automation products, and is now talking about "Data Center as a Service". More recently, BMC, which has MAINVIEW AutoOperator and other products in its automation suite, bought BladeLogic, which specializes in automating data centres.

It seems data centre automation is still an exciting space to be in.

Monday 17 March 2008

Disaster recovery

I thought this week I’d talk about mainframe recovery following a natural or man-made disaster. It’s always an interesting and important topic.

The first thing to say is that the recovery strategy for a company should be driven by informed decisions by the board rather than by the IT department. It is the company directors who have to decide how long the company can be without IT services, how much data can be lost, and also how much money should be spent on it. This is where the IT department needs to help with the discussion – making it clear how much meeting the first two criteria will cost.

In a batch world, and yes I am old enough to remember those days, it was enough to have yesterday’s system up and running a few hours after the disaster occurred. This is quite a cheap option and for most companies is perhaps 20 years out-of-date. Nowadays, a company cannot afford to have its online system unavailable for very long at all, but how long is "long"? The company also cannot afford to lose any data, but a compromise will have to be made between cost and amount of data lost. And the IT department has to work within those constraints and with a sensible budget (also, as I said above, decided by the board).
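Just to show the shape of that discussion, here’s a back-of-an-envelope Python sketch comparing three made-up recovery options. Every figure in it is invented – the point is simply that time to recover, data lost, and cost pull in different directions, and only the board can decide where the balance lies.

# Illustrative only: all costs and times are invented placeholders.
options = [
    # name,                        hours down, minutes of data lost, cost per year
    ("restore yesterday's backup",    8.0,       24 * 60,               50_000),
    ("asynchronous remote copy",      0.25,      1,                    400_000),
    ("synchronous remote copy",       0.05,      0,                    900_000),
]

cost_per_hour_down = 100_000   # hypothetical business impact per hour of outage

for name, hours_down, minutes_lost, yearly_cost in options:
    exposure = hours_down * cost_per_hour_down
    print(f"{name:28s} outage cost ~${exposure:>9,.0f}, "
          f"data at risk {minutes_lost:>4} min, runs at ${yearly_cost:,}/yr")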

For many organizations, the disaster recovery strategy involves having a standby mainframe supporting up to two other mainframes in a Geographically Dispersed Parallel Sysplex (GDPS) configuration. (In fact, this is called GDPS/MGM and has been available since November 2000). The big advantage of GDPS is that it is transparent to the applications running on the mainframe (CICS, DB2, etc). In an ideal world, there is no data loss, no impact on local operations, and no database recovery operations are required – however this does assume that there are no connectivity problems and that there are no write failures on the standby machine. There are, not surprisingly, some disadvantages. You need duplicate DASD, and there are also high bandwidth requirements.

GDPS makes use of Metro Mirror and Global Mirror (the MGM part of the acronym above). Metro Mirror (also called PPRC – Peer-to-Peer Remote Copy) works in the following way. An application subsystem writes to a primary volume. The local site’s storage control then disconnects from the channel and writes to a secondary volume at the remote site. The remote site’s storage control signals that the write operation is complete, and the local site’s storage control then posts I/O complete. The advantage of Metro Mirror is that there is minimal host impact, because the entire copy process is performed by the disk hardware. There are some disadvantages. There is a limit to the distance between the primary and secondary sites (about 25 miles if you want it quick, 180 miles if you want it at all). In some locations this might not be a problem, but in others it definitely could be. The other penalty is that DASD write times are all longer than normal, because every write has to wait for the remote copy.
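Expressed as code, the sequence looks something like this Python sketch. It only illustrates the ordering of events – it’s not how the disk subsystem microcode actually works.

# Illustrative only: a toy model of a synchronous (PPRC-style) write.
class Volume:
    def __init__(self, name):
        self.name, self.blocks = name, []
    def write(self, data):
        self.blocks.append(data)
        print(f"{self.name}: wrote {data!r}")

def synchronous_write(data, primary, secondary):
    primary.write(data)        # 1. application write hits the primary volume
    secondary.write(data)      # 2. local storage control sends it to the remote site
    print("remote site: write acknowledged")   # 3. remote storage control confirms
    print("local site: I/O complete posted")   # 4. only now does the application continue
    # The application waits for all four steps, which is why write times grow
    # with distance.

synchronous_write("payroll record", Volume("primary"), Volume("secondary"))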

Global Mirror is an asynchronous methodology (Metro Mirror is synchronous); its host-based relative, z/OS Global Mirror (XRC – eXtended Remote Copy), is also asynchronous. Global Mirror uses Global Copy and FlashCopy. At fixed intervals it invokes a point-in-time copy at the primary site, which has very little impact on local performance. Information from many volumes is then copied to the recovery site at the same time. Global Mirror allows there to be a much greater distance between the main mainframe and the standby mainframe, but the data may be "old" – ie not current. On the bright side, the data may be as little as just a few seconds old. Recovery time using Global Mirror is estimated at between 30 seconds and ten minutes for applications to be automatically up and running.
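To contrast with the synchronous sketch above, here’s an asynchronous one: writes complete locally straight away, and a consistent point-in-time copy goes to the recovery site at intervals. Again, this is only the principle – the interval here counts writes rather than seconds, and none of it reflects Global Mirror internals.

# Illustrative only: a toy model of interval-based asynchronous mirroring.
import copy

class AsyncMirror:
    def __init__(self, interval_writes=3):
        self.primary, self.recovery = [], []
        self.interval_writes = interval_writes   # stand-in for a time interval
        self._since_copy = 0

    def write(self, data):
        self.primary.append(data)   # completes immediately - no waiting on the remote site
        self._since_copy += 1
        if self._since_copy >= self.interval_writes:
            self.recovery = copy.deepcopy(self.primary)   # point-in-time copy shipped
            self._since_copy = 0

m = AsyncMirror()
for record in ["a", "b", "c", "d"]:
    m.write(record)

print(m.primary)    # ['a', 'b', 'c', 'd']
print(m.recovery)   # ['a', 'b', 'c'] - consistent, but a little "old", as described above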

Other useful parts of the GDPS family are xDR and HyperSwap. xDR is the short name for GDPS/PPRC Multiplatform Resilience for Linux on System z, which provides disaster recovery for Linux on System z users. This is particularly useful for distributed hybrid applications such as WebSphere.

HyperSwap can be used in the event of a local disk subsystem failure. The HyperSwap function is controlled by GDPS automation. HyperSwap is basically software technology that swaps in the Metro Mirror devices at the secondary site to replace those at the primary site. The whole swap takes just a few seconds (ideally).

Luckily for users, GDPS works with all disk storage subsystems supporting the required levels of PPRC and XRC architectures – as well as IBM this includes disks from EMC, HDS, HP, and Sun/StorageTek. GDPS also supports Peer-to-Peer Virtual Tape Server tape virtualization technology.

Some people I have spoken to have mentioned problems with certain types of disk and GDPS, and even channel problems that have taken a while to fix. I wondered whether anyone else had experienced problems with what, on the face of it, seems like a very user-friendly solution to disaster recovery.

Monday 10 March 2008

IMS products

Now just suppose that you run IMS on your mainframe, and you wonder what products are available to make using IMS easier or extend its functionality. Where do you go? You could type "IMS" and "products" into Google and then search through 1,860,000 entries. But then you discover that IMS as an acronym is used for lots of different things, including: Information Management System, IP Multimedia Subsystem, Institute of Medical Sciences, Immunomagnetic Separation, IMTEK Mathematica Supplement, Indianapolis Motor Speedway, Industrial Methylated Spirit, Instructional Management Systems, Insulated Metal Substrate, Integrated Micro Solutions, International Musicological Society, Internet Map Server, Internet Medieval Sourcebook, Ion Mobility Spectrometer, Irish Mathematical Society, Irish Marching Society, Irritable Male Syndrome (what can they mean?), Insight Meditation Society, Iowa Mennonite School, etc. (Thanks to Wikipedia for the list.)

You could try searching on "Information Management System" and "products". Just 1,720,000 hits returned for that one.

You could simply try searching your favourite software vendors’ Web sites, but that won’t give you all the products that are available that work with IMS. It will probably only tell you about ones you already knew about.

But there is a solution. Try the Virtual IMS Connection Web site’s Tools page at www.virtualims.com/tools.html. It is regularly updated and currently lists over 50 products from ASG, Attachmate, Attunity, BMC, CA, Circle Computer Group, Compuware, CONNX Solutions, Data Kinetics, DataVantage – Direct Computer Resources, GT Software, IBM, Iona, NEON Enterprise Software, and Seagull Software. Of course, if you know of an IMS product – and I mean Information Management System, not any of those others – then get in contact with trevor@itech-ed.com.

So what products are listed? Everything from Data Kinetics’ Accelerated In-memory data access for IMS to Compuware’s Xpediter/IMS.

The other part of the Web site that’s well worth checking out at regular intervals is the News page – it’s at www.virtualims.com/news.html. This page not only includes a short description of any new or updated IMS-related products, but it also contains a link to the original press release, so you can read the announcement in detail (including all the "tall-talking" adjectives and adjectival phrases!).

If you become a member of the Virtual IMS Connection User Group, you also get free access to the regular User Group meetings, the Newsletter, and the e-mailed "updates".

So if you are an IMS professional and you haven’t found the site yet, it’s well worth taking a look. And if you are looking for a way to extend your IMS system, then the product list is an excellent starting point. For vendors, you get a free mention in the News and Tools sections, and you can also advertise elsewhere on the site.

Monday 3 March 2008

IBM launches its z10 machine

Last week IBM announced its long-awaited z10 mainframe. And it really looks like a good one.

It’s more powerful than the z9 – each box can be fitted with up to 64 processors built from purpose-built quad-core chips. IBM says it’s 50 percent faster, delivers up to 100 percent better performance for CPU-intensive jobs, and provides 70 percent more capacity than a z9.

Internal bandwidth has more than doubled – support for InfiniBand means support for 6GB per second (up from 2.7GB per second).

Is it "green"? IBM said that a single System z10 is the equivalent of nearly 1,500 x86 distributed servers. They also suggested it had an 85 percent smaller footprint and 85 percent lower energy costs. In addition, IBM said that the system allows the consolidation of x86 software licences at up to a 30:1 ratio. Without knowing exactly how much this processor will really cost to buy (I mean, who pays list prices?), it’s difficult to calculate whether there is a saving over 1,500 x86 servers, and if so how much.
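For what it’s worth, here’s the back-of-an-envelope version of that calculation in Python. The 85 percent figure comes from IBM’s announcement; every other number (power draw per server, electricity price) is a placeholder I’ve invented – which is really the point about how hard the comparison is to do properly.

# Illustrative only: the per-server wattage and electricity price are made up.
x86_servers    = 1500
watts_per_x86  = 400          # hypothetical average draw per distributed server
kwh_price      = 0.10         # hypothetical price per kWh
hours_per_year = 24 * 365

x86_energy_cost = x86_servers * watts_per_x86 / 1000 * hours_per_year * kwh_price
z10_energy_cost = x86_energy_cost * 0.15      # "85 percent lower energy costs"

print(f"x86 farm energy bill: ${x86_energy_cost:,.0f} a year")
print(f"z10 energy bill     : ${z10_energy_cost:,.0f} a year")
print(f"energy saving       : ${x86_energy_cost - z10_energy_cost:,.0f} a year")
# Whether that saving outweighs the purchase price depends on the discount
# you actually negotiate - which is exactly the problem.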

Is it modern? Again, IBM suggests users could run, for example, SAP on Linux with DB2 on z/OS, all in one box. And, as demonstrated towards the end of 2007, IBM is working with Sun to bring Solaris to the z10.

Looking at it in more detail, the quad-core processor chip contains 991 million transistors and comes with 3MB of L2 cache per core. How fast is it? IBM reckons the chip can operate in excess of 4.4GHz. Also available is a separate, dedicated "service" processor, which adds 24MB of L3 cache that can be shared by all the processor cores.

The top-end z10 processors use five quad-core die packages and two service processors, which, if you do the sums, comes to 20 cores at 4.4GHz, plus 60MB of L2 cache and 48MB of shared L3 cache on a single processor package.
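For anyone who wants those sums done for them, in Python:

# The arithmetic from the paragraph above.
quad_core_dies = 5
service_chips  = 2
l2_per_core_mb = 3
l3_per_chip_mb = 24

cores    = quad_core_dies * 4                 # 20 cores at 4.4GHz
l2_total = cores * l2_per_core_mb             # 60MB of L2 cache
l3_total = service_chips * l3_per_chip_mb     # 48MB of shared L3 cache
print(cores, l2_total, l3_total)              # -> 20 60 48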

Wow!