A Systematic Characterization of Application Sensitivity to Network Performance



however, is often not captured well by stochastic processes. For example, bulk-synchronous parallel
programs alternate between communication and computation phases; thus all communication occurs
in large bursts. Communication events are not random, nor is there a global steady-state. We shall
examine application behavior in detail when we examine the sensitivity results.
However, we do use a queuing theoretic model in the study of NFS servers. In this case,
the benchmark itself is influenced by the stochastic model of program behavior. These self-referential
assumptions are part of the reason why the queuing model works well for the benchmark. We will
examine this phenomenon in Section 5.6.
2.6 Related Methodologies
This section describes the methodologies of related work. Although it is impossible to cover
all related research, we highlight four studies whose results are most relevant to our work. We show
that although both the application and design spaces are enormous, nearly all of the studies can be
placed along the two axes of experiment design outlined in Section 1.1.
Where possible, we explain the results of these studies using the LogGP model. Although
most of the studies did not use LogGP, we show that most of the results can be interpreted in a LogGP
framework. Casting other results into a LogGP framework also serves as additional validation of the
model.
2.6.1 Holt
The focus of [52] is quite close in spirit to this thesis. The main question in that work was
how shared-memory parallel programs would respond to different abstract machine parameters. The
study was application-centric and used simulation as the evaluation method. Its abstract parameters
are very close to those of the LogGP model. Because the programs were written using shared memory,
and the machines studied were cc-NUMA designs, the study introduced a new term, occupancy.
In terms of the LogGP model, the occupancy of a machine’s cache controller lies somewhere between
overhead and gap. Occupancy can limit the peak message rate. Even with speculative and out-of-order
processors, a high occupancy can also stall the processor, causing an increase in overhead. The
ª of the Holt model was unchanged from that in LogGP.
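The claim that occupancy can limit the peak message rate can be illustrated with a minimal sketch. It assumes, purely for illustration, that the controller is busy for the larger of the LogGP gap and the occupancy on every message; the function name and the microsecond units are hypothetical and are not taken from the Holt study.

```python
def peak_message_rate(gap_us: float, occupancy_us: float) -> float:
    """Upper bound on messages per microsecond for one network interface.

    Assumption (not from the source): each message ties up the interface
    for max(gap, occupancy), so the slower of the two sets the rate.
    """
    if gap_us <= 0 or occupancy_us <= 0:
        raise ValueError("gap and occupancy must be positive")
    return 1.0 / max(gap_us, occupancy_us)


if __name__ == "__main__":
    # Occupancy below the gap: the gap is the bottleneck.
    print(peak_message_rate(gap_us=5.0, occupancy_us=2.0))
    # Occupancy above the gap: occupancy now caps the message rate.
    print(peak_message_rate(gap_us=5.0, occupancy_us=20.0))
```

Under this assumption, raising occupancy has no effect on the peak rate until it exceeds the gap, after which the rate falls in inverse proportion.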
In addition to a similar network model, Holt’s experimental paradigm was quite similar to
that of this thesis. Instead of “slowdown”, however, parallel efficiency was used as the application-centric
metric. Even without the initial run-time, slowdown can be derived from this metric, as they both
share the same denominator. Parallel efficiency, however, has a number of problems when used as
a metric compared with slowdown. The primary problem is that it obscures the real question, which
is machine sensitivity to the parameters.
The Holt study found that occupancy, as opposed to latency, was the dominant term affecting
parallel program performance. Much like software overhead, occupancy in shared-memory
programs is difficult to tolerate [32]. It found that a very high L, in the thousands of machine cycles,
would reduce efficiency by 50%, a factor of 2 in slowdown. However, a much smaller increase in
controller occupancy, into the hundreds of cycles, could reduce efficiency by up to 50% as well.
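The correspondence between a 50% efficiency loss and a factor of 2 in slowdown can be checked with a short sketch. It assumes the usual definition of parallel efficiency, E = T1 / (P * TP), with the single-processor time T1 and the processor count P held fixed between the baseline and the perturbed run; the function name is hypothetical.

```python
def slowdown_from_efficiency(base_eff: float, perturbed_eff: float) -> float:
    """Slowdown implied by a drop in parallel efficiency.

    With E = T1 / (P * TP) and T1, P fixed (an assumption of this sketch),
    TP_perturbed / TP_base equals base_eff / perturbed_eff.
    """
    if base_eff <= 0 or perturbed_eff <= 0:
        raise ValueError("efficiencies must be positive")
    return base_eff / perturbed_eff


if __name__ == "__main__":
    # Efficiency halved from 0.8 to 0.4: run-time doubles.
    print(slowdown_from_efficiency(0.8, 0.4))
```

Note that the baseline run-time itself is not needed, only the ratio of the two efficiencies.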
The Holt study also developed a number of analytic models to investigate the effects of
increased latency and occupancy. The simplest models used a simple frequency-cost pair scheme
and ignored contention effects. On a 64-processor machine, this model was off by up to 40%. A
more accurate queuing model reduced the difference between the simulation and the model to less
than 15%. Unfortunately, the model was only used for one simple application, so we cannot draw
conclusions about the general accuracy of the model compared to the simulation.
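The difference between a frequency-cost model and a contention-aware one can be sketched as follows. The first function sums event counts times fixed costs. The second substitutes a textbook M/M/1 response time for the fixed cost, which is an assumption made here for illustration only and is not the queuing model used in the Holt study; all names and numbers are hypothetical.

```python
def frequency_cost_runtime(compute_cycles, events):
    """Run-time estimate that ignores contention.

    events is an iterable of (count, cost_cycles) pairs, one per event type.
    """
    return compute_cycles + sum(count * cost for count, cost in events)


def mm1_response_time(service_cycles, arrival_rate):
    """Mean response time of an M/M/1 queue (assumed stand-in for contention).

    arrival_rate is in requests per cycle; utilization must stay below 1.
    """
    utilization = arrival_rate * service_cycles
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_cycles / (1.0 - utilization)


if __name__ == "__main__":
    events = [(10, 50), (5, 100)]  # hypothetical (count, cost) pairs
    print(frequency_cost_runtime(1000, events))
    # At 50% controller utilization a 100-cycle service takes about 200 cycles.
    print(mm1_response_time(100, 0.005))
```

The gap between the fixed cost and the queued response time grows quickly as utilization approaches 1, which is one plausible reason a contention-free model would underestimate run-time on larger machines.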
2.6.2 Chang
A recent work [21] examined NFS performance over congested ATM networks. The goal
of this work was to determine the effect of various ATM credit-based flow control schemes on NFS.
The work was an application-centric simulation study. A simulation fed by traces from various NFS
tasks (e.g., a compile) was used for the evaluation. The study used the run-time of the entire task as the
metric for evaluation, as opposed to measuring individual operations of the NFS protocol. The most
closely related part of the methodology was that the study examined the impact of scaling an abstract
parameter, L, in addition to point-wise comparisons of flow control schemes. The study did not explore
other parameters, however.
The study found that a high L, over 10 milliseconds, was detrimental. However, a low L,
in the µs range, was not found to impact performance. In addition, the study found that the combination
of the TCP backoff algorithm and segment sizes can slow performance down by as much as 30%.
Because of the custom workloads, however, making a direct comparison of absolute sensitivities to
our NFS results is difficult.

2.6.3 Ahn
This study [1] compared two TCP congestion control and avoidance strategies, TCP Reno
and TCP Vegas. The study examined the effects of these different congestion avoidance strategies
on FTP traffic. Much as in this thesis, a slowdown layer, called the “hitbox”, was interposed under
the TCP/IP stack to emulate WAN links. The work was application-centric and used an emulation
methodology. The hitbox was built as an interposition layer between the IP layer and the device
driver in the BSD operating system. Unlike this thesis, however, the independent variable of the
experiment was not a set of abstract parameters. Rather, the independent variable was the TCP
algorithm.
The hitbox can abstract the link bandwidth, propagation delay, and bit error rate. The
methodology used to construct the emulator differed from that of this thesis in that many links were used to
construct the network. That is, each link was designed to emulate a single wide area link, and many
hosts with multiple links were used to emulate a WAN. This approach contrasts with our approach, where
we emulate the entire network using a single delay.
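A per-link emulator of the kind described can be sketched as a FIFO link with a fixed bandwidth and propagation delay. This is a minimal model under stated assumptions, not the hitbox implementation: the function name and parameters are hypothetical, packets are assumed to be sorted by send time, and bit errors are left out.

```python
def delivery_times(packets, bandwidth_bps, prop_delay_s):
    """Arrival times for packets crossing one emulated link.

    packets is a list of (send_time_s, size_bytes), sorted by send time.
    Each packet waits for the link to become free, is serialized at
    bandwidth_bps, and then incurs a fixed propagation delay.
    """
    if bandwidth_bps <= 0 or prop_delay_s < 0:
        raise ValueError("bandwidth must be positive and delay non-negative")
    link_free_at = 0.0
    arrivals = []
    for send_time, size_bytes in packets:
        start = max(send_time, link_free_at)      # queue behind earlier packets
        link_free_at = start + size_bytes * 8 / bandwidth_bps
        arrivals.append(link_free_at + prop_delay_s)
    return arrivals


if __name__ == "__main__":
    # Two 1000-byte packets sent back to back over an 8 Mb/s link with 50 ms delay.
    print(delivery_times([(0.0, 1000), (0.0, 1000)], 8e6, 0.05))
```

Chaining several such links, each with its own parameters, approximates the multi-link construction, whereas a single call with one delay value corresponds more closely to the single-delay approach.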
The metric used was rather simple: the time to transfer a 512 KB file with FTP. In other experiments, the
average of many simultaneous transfers was used as the dependent variable. Although the dependent
variable in the experiment was application-centric, the study attempted to answer several network-centric
questions as well. These included which TCP algorithm transmitted more bytes through the
network, as well as which resulted in longer queues at the switches.
The study found that TCP Vegas can increase delivered bandwidth by 3-5% over the Reno
version. The additional overhead of Vegas over Reno was described but not measured. In addition,
the study found that Vegas placed a lighter load on the network switches, in terms of offered
bandwidth, than Reno.
In spite of the excellent apparatus, the study was somewhat disappointing because the application-centric
focus was not fully investigated. Only competing FTP traffic was measured.
However, the hitbox emulation system could have measured the impact of different
network designs and algorithms on a wide variety of HTTP, NFS and multimedia traffic as well.
The construction of the hitbox raised the issue of the apparatus changing what it is trying to
measure. Because the authors did not use a separate network processor to emulate network parameters, the
hitbox itself added communication overhead. The study concluded that the additional overhead was
only 3%, but the calibration methodology was slightly dubious. The study measured the slowdown
of a quicksort while the idle hitbox was running in the background. A better methodology would have been
