A Systematic Characterization of Application Sensitivity to Network Performance

message passing layers is not impossible, but is certainly a formidable engineering task. The chal-
lenge to future network interface designers will be to reduce overhead and gap while maintaining
connectivity between applications in the existing infrastructure.
One might be tempted to simply add CPUs and network interfaces to a large SMP box to
decrease the effective gap and Gap, or to amortize the overhead among many processors. However,
such an approach has several limitations. First, the parallelization of a single stream is quite limited
under current operating systems [91]. Thus, in order to obtain a reduced gap, the application itself
has to parallelize its communication into multiple streams. Second, the size of the machine needed
to sustain a very high effective g and G is substantial. For example, adding just 8 gigabit
network interfaces to a server and using them simultaneously requires 8 separate I/O busses. While
machines of this size do exist, the very high premium attached to this class of machines is well known.
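
The bus count in the example above can be checked with a short back-of-the-envelope calculation. The sketch below assumes a 32-bit, 33 MHz PCI-style bus with a peak rate of roughly 132 MB/s; the text does not name the bus type, so the bus parameters and the function name `buses_needed` are illustrative assumptions, not figures from the thesis.

```python
import math


def buses_needed(num_nics, nic_gbps=1.0, bus_width_bits=32, bus_mhz=33.0):
    """Estimate how many I/O busses are needed to drive all NICs at once.

    Assumption (not from the thesis): each bus delivers its peak rate of
    width * clock, with no allowance for arbitration or protocol overhead.
    """
    nic_mb_per_s = nic_gbps * 1000.0 / 8.0         # 1 Gb/s is 125 MB/s
    bus_mb_per_s = bus_width_bits / 8.0 * bus_mhz  # 32 bits at 33 MHz is 132 MB/s
    # Whole NICs that fit on one bus; a bus slower than one NIC still hosts one.
    nics_per_bus = max(1, math.floor(bus_mb_per_s / nic_mb_per_s))
    return math.ceil(num_nics / nics_per_bus)


if __name__ == "__main__":
    print(buses_needed(8))                                     # 32-bit / 33 MHz bus
    print(buses_needed(8, bus_width_bits=64, bus_mhz=66.0))    # wider, faster bus
```

Under these assumptions a single gigabit interface nearly fills one 32-bit, 33 MHz bus, which is consistent with the one-bus-per-interface figure in the text; a wider or faster bus would lower the count.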
7.4 Modeling
We have found that simple models can give “reasonable” performance predictions. The
simple frequency-cost pair overhead models were often close to the measured performance; at worst
they were off by 50%. The predictions for gap were farther off, and those for latency were even
more inaccurate. From an architectural standpoint, these results show that a simple frequency-cost
pair analysis is an adequate “ballpark” measure for a system designer. However, more detailed
application and system models (e.g. [39, 43]) are needed to make truly accurate predictions across a
range of applications and machine configurations.
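
As a rough illustration of what a frequency-cost pair analysis looks like, the sketch below predicts run time as the baseline plus the sum of (event frequency × added cost per event). The function names and all numbers are hypothetical and are not taken from the measurements in this thesis; the exact form of the models used in earlier chapters may differ.

```python
def predict_runtime(base_runtime_s, events):
    """Predict run time from frequency-cost pairs.

    events: iterable of (frequency, added_cost_s) pairs, e.g. the number of
    messages sent and the extra overhead charged to each one.
    """
    return base_runtime_s + sum(freq * cost for freq, cost in events)


def relative_error(predicted_s, measured_s):
    """Fractional error of a prediction against a measured run time."""
    return abs(predicted_s - measured_s) / measured_s


if __name__ == "__main__":
    # Hypothetical program: 10 s baseline, one million messages, and 10 us of
    # added overhead paid once on the send side and once on the receive side.
    added_overhead_s = 10e-6
    events = [(1_000_000, 2 * added_overhead_s)]
    predicted = predict_runtime(10.0, events)
    print(f"predicted run time: {predicted:.1f} s")
    # Compare against a made-up measurement to get the prediction error.
    print(f"relative error: {relative_error(predicted, 33.0):.0%}")
```

A model of this kind needs only two inputs per event type, which is what makes it usable as a “ballpark” estimate, and also why it cannot capture effects such as burstiness or serial dependencies.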
The simple models proved useful in evaluating assumptions about application behavior.
For example, the simple gap models showed that communication is bursty in nature for both the
Split-C/AM programs and NPB. These models also showed that serial dependencies can cause
hypersensitivity to overhead. The radix sort is a prime example of this effect.
The queuing models used for the NFS system are much more accurate than the simple
frequency-cost pair models used for the other applications. This accuracy, however, is somewhat
circular. The SPECsfs benchmark is built using some assumptions of queuing theory, namely that the traffic
is generated as a Poisson process. Given that observed traffic is quite bursty, we would expect actual
NFS traffic to be more sensitive to overhead and Gap than our results showed. However, the observed
sensitivity to G is very low: even under the worst-case assumption that all messages
are sent in bursts, the small size of observed NFS requests means that even current LANs will not
bandwidth-limit NFS.
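
To show why the Poisson assumption matters, here is a minimal sketch of a textbook single-server M/M/1 queue. It is a stand-in chosen for brevity, not necessarily the queuing model used for the NFS server in this thesis, and the function names and the service-time and load values are hypothetical. Added per-operation overhead is modeled simply as an increase in service time.

```python
import math


def mm1_response_time(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue (Poisson arrivals, one server).

    Returns math.inf once the offered load reaches or exceeds saturation.
    """
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        return math.inf
    return service_time / (1.0 - utilization)


def saturation_rate(service_time):
    """Arrival rate (operations per second) at which the server saturates."""
    return 1.0 / service_time


if __name__ == "__main__":
    base_service_s = 0.001     # hypothetical 1 ms per operation
    added_overhead_s = 0.0002  # hypothetical extra overhead per operation
    for load in (200, 500, 800):
        before = mm1_response_time(load, base_service_s)
        after = mm1_response_time(load, base_service_s + added_overhead_s)
        print(f"{load} ops/s: {before * 1e3:.2f} ms -> {after * 1e3:.2f} ms")
    print(f"saturation: {saturation_rate(base_service_s):.0f} ops/s")
```

In this simplified model, added overhead raises the response time at every load and lowers the saturation point. Bursty arrivals violate the Poisson assumption, so a real server would be expected to queue more than this sketch predicts.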
The one place where the queuing model proved quite useful was in interpreting the results
of vendors’ SFS curves. Section 5.6 gave a small example of how we could compare two servers
given SFS curves and the SPECsfs disclosures. We saw that we could, for example, derive the to-
tal software overhead from observing the base, slope and saturation points. An interesting exercise
would be to see how well the model did on a variety of published curves. However, such a compar-
ison is beyond the scope of this thesis.
7.5 Final Thoughts
Modern computer systems have reached mind-boggling complexity. The design of a modern
business server includes sub-systems that are impressive engineering achievements in their own
right: the processor, the memory and I/O system, the operating system, the database and the business
application logic.
To make the example more concrete, imagine the number of designers involved in a 4-way
UltraSPARC III server, running Solaris 7, Oracle 8 and SAP R/3 on 18 GB IBM disk drives, stitched
together with the UPA memory bus, multiple PCI and SCSI busses, and connected to the outside
world via Alteon gigabit Ethernets (each with 2 processors). The number of people involved in the
entire design certainly ranges into the tens of thousands. No one person can hope to understand it
all. Yet, performance analysis of such systems is not an impossible task.
The staggering complexity of such systems will require computer performance analysts to
increasingly use “black-box” methods. In this thesis we investigated one such method in the context
of computer networks. Similar analysis techniques will eventually become an accepted methodology
in computer science, much as they have in the other sciences.
With regard to our findings, we leave the reader with a short analogy in the hope it will
serve as an aid to recalling our experimental results. The quotations at the beginning of this chapter
draw a parallel between an everyday experience of many drivers in the San Francisco Bay Area and
that of users of modern computer networks. Commuters often wonder why, after the opening of the
billion-dollar Cypress freeway, congestion seems just as bad as when using the previous detour,
Interstate 980. On a smaller scale, computer users wonder why, after installing their new gigabit
networks, applications don’t seem any faster. In both cases, it’s not the freeway or network that is
the limiting factor per se. Rather, it’s the access to the network or freeway that is the real
bottleneck: software overhead in computer networks, on-ramps in the freeway case.

The SPINE work showed some of the benefits and costs of using more specialized software
to reduce overhead. Although quite successful at reducing overhead, the resulting pipeline was not
faster in terms of gap or latency than a fast CPU running a more standard, but still modified, TCP/IP
stack. Chapter 6 showed that it is an open question whether the overhead reduction obtainable with novel
SAN protocols can be achieved with the much more common Internet protocols.
In the final analysis, we can conclude from the results in this thesis that computer systems
are complex enough to warrant our controlled-perturbation, emulation-based methodology. We
observed that programmers used a variety of latency-tolerating techniques and that these work quite
well in practice. However, many of these techniques are still sensitive to software overhead. We
found that without either more aggressive hardware support or the acceptance of radical new
protocols, software overheads will continue to limit communication performance across a wide variety of
application domains.
