Grid Computing is a term that’s being bandied about more and more in the IT industry – and, unusually for this industry, it’s an extremely important thing to know about, because it might just be the future for the average computer network.

The fundamentals of parallel computing
Most of us have come across Symmetric MultiProcessing (SMP) – servers that have more than one processing unit. The idea is that by putting multiple processors in a computer it can do more work, in a shorter time, by allocating different tasks to different processors. All the processors can work “in parallel” – i.e. several bits of work get done at once because several processors are doing their own particular tasks independently of the others.

The next step up is Massively Parallel Processing (MPP) – where you have a sizeable collection of processors, generally not in the same box, all doing their own bit of work in parallel. The difference between MPP and SMP is that the latter generally uses a small number of processors (eight or fewer) connected by a common data bus. In contrast, MPP uses an arbitrary number of CPUs linked in an arbitrary way, with some complex processor allocation software built on top to handle the dishing out of work and collation of results. Although not in the same physical box, the processors tend to be in a single location, usually in a single room, connected by the highest-speed interconnection available (often Gigabit Ethernet in today’s installations).

Parallel computing has two main problems: cost and intercommunication. SMP systems work well because they have a small number of processors interconnected with a single, shared bus running at tens of Gbit/s. MPP systems bring more scalability (you can’t really get above eight processors on a shared SMP bus because the bus becomes a bottleneck) but at the cost of complexity (you have to do more work to handle the intercommunication) and speed (the inter-processor links are usually only 1-2 Gbit/s). There comes a point, though, where cost becomes an issue.
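Before turning to cost, the “dishing out of work and collation of results” described above can be illustrated with a minimal Python sketch. It is not how any particular SMP or MPP product allocates work. The function names are hypothetical, the task (summing squares) is chosen purely for illustration, and a thread pool stands in for the separate processors so that the pattern itself is easy to see.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Divide a list into n_chunks roughly equal, contiguous pieces."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The unit of work each worker performs independently of the others."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, n_workers=4):
    """Dish out one chunk per worker, then collate the partial results."""
    chunks = split_into_chunks(items, n_workers)
    # Assumption: threads stand in for processors here; a real system
    # would dispatch the chunks to separate CPUs or machines.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

Calling `parallel_sum_of_squares(list(range(1000)))` gives the same answer as doing the sum on a single processor. The only extra work is the splitting and the final collation, which is exactly the overhead that grows as the processors move further apart.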
If you’re putting thousands of processors in a single location, you have to buy those processors and the appropriate interconnection fabric, as well as employing people to keep the lights on and ensure the thing is giving value by being used 24×7. Oh, and you have the task of scheduling access to the resources, and potentially of partitioning those resources so that they can handle more than one project at once – a concept that harks back to the move from centralized mainframes toward distributed desktop computers.

The grid approach
Grid Computing is, in its basic form, an extension of the MPP concept. The idea is similar in that you have a set of computers, connected in an arbitrary way, but these systems can live in different buildings, cities or continents. An example is the North Carolina BioGrid, which currently spans five of the University of North Carolina’s sites. Grid Computing wipes out the issues of MPP by distributing the processing capability over a number of sites. Instead of having one vast parallel system, you have a number of smaller ones, possibly on different sites, which can be used in whatever combinations are deemed appropriate. So systems can be used independently or pooled into a parallel system of any size, depending on demand. And generally speaking, two average-sized systems are less costly to procure than one big one of the same overall power.

Where’s this heading?
Let’s map the multi-site concept back into the corporation. The majority of companies don’t have multiple sites with multiple computer systems, but they do have multiple departments, often with dedicated systems. So it’s common to have a server for the accounts system, a server for the ERP system, a server for the email system, and so on. The ERP system’s usually busy during working hours, but not during the night. The email server has its busy periods and its lazy times, too. The accounts system will have some major peaks and troughs – it’ll be hammered to death once a month when it’s payroll time, for instance, and then once a year there’s the annual end-of-year beating.

So why don’t we forget the concept of single-purpose systems and instead make our world into a single, heterogeneous “virtual computer”? That way, the ERP software might always run on its own system, but when it’s time to do the payroll run, some of the ERP system’s spare capacity can be used to work out the staff National Insurance contributions. And if there’s a flurry of email as a result of a big advertising campaign, that’s fine, because all that spare time on the accounts server can be used by the email function.
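The “virtual computer” idea can be made concrete with a small Python sketch. It is a toy model under stated assumptions, not a description of any real grid scheduler. The function name `allocate`, the notion of abstract “units” of capacity, and the figures in the usage note are all hypothetical. Each application runs on its own server first, and anything it cannot fit spills over onto whichever servers have capacity to spare.

```python
def allocate(capacity, demand):
    """Place each workload on its home server, then spill overflow elsewhere.

    capacity: dict mapping server name -> units of work it can handle
    demand:   dict mapping server name -> units its own application needs
    Returns (placements, unmet): placements maps (workload, host) -> units,
    and unmet maps workload -> units that no server could absorb.
    """
    placements, spare, overflow = {}, {}, {}

    # Pass 1: every application uses its own dedicated server first.
    for server, cap in capacity.items():
        need = demand.get(server, 0)
        local = min(cap, need)
        if local:
            placements[(server, server)] = local
        spare[server] = cap - local
        if need > local:
            overflow[server] = need - local

    # Pass 2: overflow borrows whatever spare capacity other servers have.
    unmet = {}
    for workload, remaining in overflow.items():
        for host in capacity:
            if remaining == 0:
                break
            borrowed = min(spare[host], remaining)
            if borrowed:
                placements[(workload, host)] = borrowed
                spare[host] -= borrowed
                remaining -= borrowed
        if remaining:
            unmet[workload] = remaining
    return placements, unmet
```

Suppose, purely for illustration, that the ERP, email and accounts servers can each handle 10 units, and that on payroll night they need 2, 3 and 18 units respectively. Calling `allocate` with those figures keeps 10 units of payroll work on the accounts server, places the other 8 on the idle ERP server, and leaves nothing unmet. With dedicated single-purpose systems, those 8 units would simply have had to wait.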