A systematic Characterization of Application Sensitivity to Network Performance

version of the apparatus required substantial modification to the semantics of the Active Message
layer.
2.3.1 Basic Split-C/AM Apparatus
In this section we first describe the hardware used. Next, we provide background on the
GAM layer, which forms the core communication system of our apparatus. We then describe how
we vary the LogGP parameters by engineering controllable delays into GAM. Finally, we briefly
describe how we calibrated the apparatus using a simple microbenchmarking technique.
Hardware
The hardware for all our experiments is a NOW of 35 UltraSPARC Model 170 workstations
(167 MHz, 64 MB memory, 512 KB L2 cache) running Solaris 2.5. Each has a single Myricom
M2F network interface card on the SBUS, containing 128 KB SRAM card memory and a 37.5 MHz
“LANai” processor [17]. The processor runs our custom firmware, called the LANai Control
Program (LCP). The LANai processor plays a key role in allowing us to independently vary LogGP
parameters. The machines are interconnected with ten 8-port Myrinet switches (model M2F, 160
MB/s per port) in a two-level fat tree topology. Each of the 7 switches in the first level is connected
to five machines and to all three second-level switches. At any given time, we run programs on
only 32 machines. Because a machine or two was often down, the few spares went a long way
towards keeping 32 working machines available.
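The switch and port counts above can be cross-checked with a little arithmetic. The sketch below is only a consistency check; it assumes a single link from each first-level switch to each second-level switch, which the text implies but does not state, and all names are illustrative.

```python
# Counts taken from the hardware description above.
FIRST_LEVEL_SWITCHES = 7
SECOND_LEVEL_SWITCHES = 3
MACHINES_PER_FIRST_LEVEL = 5
PORTS_PER_SWITCH = 8


def topology_summary():
    """Derive machine, switch, and port usage totals for the two-level fat tree.

    Assumption: exactly one link joins each first-level switch to each
    second-level switch.
    """
    return {
        "machines": FIRST_LEVEL_SWITCHES * MACHINES_PER_FIRST_LEVEL,
        "switches": FIRST_LEVEL_SWITCHES + SECOND_LEVEL_SWITCHES,
        # Each first-level switch: 5 host ports plus 3 uplinks.
        "first_level_ports_used": MACHINES_PER_FIRST_LEVEL + SECOND_LEVEL_SWITCHES,
        # Each second-level switch: one downlink per first-level switch.
        "second_level_ports_used": FIRST_LEVEL_SWITCHES,
    }
```

Under that assumption the first-level switches use all 8 ports, and the totals match the 35 machines and ten switches stated in the text.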
GAM Active Message Layer
The GAM Active Message layer on Myrinet was developed as an experimental research
prototype. Its primary goal was to deliver high performance communication to parallel applications
on NOWs. Although GAM is not strictly necessary for use in this study, two of its characteristics
proved quite useful. First, its high performance increased the range of the LogGP parameter space
we can consider. Second, its simplicity allowed for easy insertion of delays into various portions of
the system.
The GAM Active Message layer follows a request-reply model. The underlying network is
assumed to be reliable, but possesses only finite buffering. Because of the finite buffering, care must
be taken to avoid fetch-deadlock. Deadlock avoidance is achieved by using credit counts between
pairs of nodes. This is the algorithm described in [32] and similar to the one used in [72]. The
layer is not thread-safe and requires polling to receive messages. Polls are automatically inserted
when sending messages, however.

Figure 2.2: Varying LogGP Parameters
This figure describes our methodology for individually varying each of the LogGP parameters. The
interaction between the host processor, the network processor (LANai) and the network is shown for
communication between two nodes.
[Diagram annotations: o: stall SPARC on message send; o: stall SPARC on message reception;
g: delay LANai after injection in Tx loop; L: set presence bit at time Rx + ∆L. Components shown
on each node: Host Processor, LANai network Processor; on the receiving LANai: Delay Queue, Rx Queue.]
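The credit counts between pairs of nodes can be illustrated with a minimal sketch. This is not the GAM implementation: the class name, the window size, and the rule that each reply returns exactly one credit are assumptions made for illustration only.

```python
class CreditLink:
    """Sender-side credit counter for one destination node (hypothetical sketch)."""

    def __init__(self, credits):
        # Number of request buffers assumed to be reserved at the receiver.
        self.max_credits = credits
        self.credits = credits

    def try_send_request(self):
        """Consume one credit; return False if the sender must poll and wait."""
        if self.credits == 0:
            return False
        self.credits -= 1
        return True

    def on_reply(self):
        """A reply signals that a receiver buffer was freed, returning one credit."""
        if self.credits < self.max_credits:
            self.credits += 1
```

With a window of two credits, a third back-to-back request is refused until a reply arrives, so the sender can never overrun the finite buffering at the receiver.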
In addition to requests and replies, messages are typed as short or long. Short messages
are up to 6 words in length, with one word consumed as a function handler. Long messages contain
a function pointer, two words for function arguments and a block of data up to 4KB long. Short and
long messages are orthogonal to requests and replies. Thus, a short or long message may be sent in
response to either type of request. A library function performs the packetization for direct
memory-copy requests longer than 4KB. Note that the GAM specification [30] does not provide an
arbitrarily long bulk-transfer reply function; replies in the Myrinet apparatus are limited to 4KB.
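The packetization of requests longer than 4KB can be sketched as follows. The function name is hypothetical, and "4KB" is taken to mean 4096 bytes, which the text does not confirm.

```python
MAX_FRAGMENT_BYTES = 4096  # assumption: "4KB" means 4096 bytes


def fragment_sizes(total_bytes, max_fragment=MAX_FRAGMENT_BYTES):
    """Split a bulk transfer into fragment sizes no larger than max_fragment."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    sizes = []
    remaining = total_bytes
    while remaining > 0:
        chunk = min(remaining, max_fragment)
        sizes.append(chunk)
        remaining -= chunk
    return sizes
```

For example, a 10000-byte transfer would be sent as two full fragments followed by one 1808-byte fragment.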
Varying the LogGP Parameters
The key experimental innovation is to build adjustments into the communication layer so
that it can emulate a system with arbitrary latency, overhead, gap and Gap. Our technique is depicted
in Figure 2.2, which illustrates the interaction of the host processor, the LANai (network interface
processor) and the network for communication between two nodes. The following paragraphs
describe in detail how we varied each parameter.
Overhead
The majority of the overhead is the time spent writing the message into the network
interface or reading it from the interface. Thus, varying the overhead, o, is straightforward. For
each message send and before each message reception, the operation is modified to loop for a specific
period of time before actually writing or reading the message.
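The idea of looping for a fixed period before touching the network interface can be sketched on a general-purpose host. This is an illustration only, not the Split-C/AM code: `spin_for`, `send_with_added_overhead`, and the `write_message` callback are hypothetical names, and a real apparatus would use a calibrated processor loop rather than a clock read.

```python
import time


def spin_for(delay_us):
    """Busy-wait for roughly delay_us microseconds, keeping the processor occupied."""
    deadline = time.perf_counter() + delay_us / 1e6
    while time.perf_counter() < deadline:
        pass


def send_with_added_overhead(write_message, message, added_o_us):
    """Stall the host for added_o_us, then perform the real write to the interface."""
    spin_for(added_o_us)
    return write_message(message)
```

A busy-wait is used rather than a sleep because overhead is defined as time during which the host processor is occupied and cannot do other work.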
gap and Gap
The gap is dominated by the message handling loop within the network processor.
Thus, to vary the gap, g, we insert a delay loop into the LCP message injection path after the message
is transferred onto the wire and before it attempts to inject the next message. Since the stall is done
after the message is actually sent, the network latency is unaffected. Also, since the host processor
can write and read messages to or from the network interface at its normal speed, overhead should not
be affected. We use two methods to prevent excessive network blocking from artificially affecting
our results. First, the LANai is stalled at the source rather than the destination. Second, the firmware
takes advantage of the LANai’s dual hardware contexts; the receive context can continue even if the
transmit context is stalled.
To adjust G, the transmit context stalls after injecting a fragment (up to 4KB) for a period
of time proportional to the fragment size. We stall the LCP for an adjustable number of microseconds
for each 100 bytes of the fragment. For example, if the Gap “knob” was set to 11, we would
stall the LANai transmit context for an extra 11 µs for each 100 bytes of data in a fragment.
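The stall added for a given fragment follows directly from the knob setting. The sketch below assumes the stall is strictly proportional to the fragment size, rather than rounded to whole 100-byte units; the text says "proportional" but does not specify rounding, and the function names are hypothetical.

```python
def gap_stall_us(fragment_bytes, knob_us_per_100_bytes):
    """Extra transmit-context stall, in microseconds, after injecting one fragment."""
    return knob_us_per_100_bytes * fragment_bytes / 100.0


def added_gap_per_byte_us(knob_us_per_100_bytes):
    """Increase in G (time per byte) contributed by the knob alone."""
    return knob_us_per_100_bytes / 100.0
```

With the knob at 11, a full 4096-byte fragment would incur about 450.56 µs of extra stall, that is, 0.11 µs added to G per byte.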
Latency
The latency, L, requires care to vary without affecting the other LogGP characteristics. It
includes time spent in the network interface’s injection path, the transfer time, and the receive path,
so slowing either the send or receive path would increase L. However, modifying the send or receive
path would have the side effect of increasing g. Our approach involves adding a delay queue inside
the LANai. When a message is received, the LANai deposits the message into the normal receive
queue, but defers setting the flag that would indicate the presence of the message to the application.
The time that the message “would have” arrived in the face of increased latency is entered into a
delay queue. The receive loop inside the LANai checks the delay queue for messages ready to be
marked as valid in the standard receive queue. Modifying the effective arrival time in this fashion
ensures that network latency can be increased without modifying o or g.
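The delay-queue mechanism can be modeled in a few lines. This is a host-side simulation under stated assumptions, not the LCP firmware: the class and method names are hypothetical, time is passed in explicitly as microseconds, and a heap stands in for whatever queue structure the LANai actually uses.

```python
import heapq


class DelayQueueReceiver:
    """Model of a receive queue whose presence flags are set after an added delay."""

    def __init__(self, added_latency_us):
        self.added_latency_us = added_latency_us
        self.rx_queue = []      # entries are [message, present_flag]
        self.delay_queue = []   # heap of (release_time_us, rx_slot_index)

    def on_arrival(self, now_us, message):
        """Deposit the message immediately, but defer marking it as present."""
        slot = len(self.rx_queue)
        self.rx_queue.append([message, False])
        heapq.heappush(self.delay_queue, (now_us + self.added_latency_us, slot))

    def service(self, now_us):
        """Receive-loop step: set the flag on every message whose time has come."""
        while self.delay_queue and self.delay_queue[0][0] <= now_us:
            _, slot = heapq.heappop(self.delay_queue)
            self.rx_queue[slot][1] = True

    def visible(self):
        """Messages the host would see when polling the receive queue."""
        return [msg for msg, present in self.rx_queue if present]
```

Because the message is copied into the receive queue at its true arrival time and only the flag is delayed, neither the injection path nor the host's read path is slowed, which is why o and g are left unchanged.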
Calibration
With any empirical apparatus, as opposed to a discrete simulator, it is important to calibrate
the actual effect of the settings of the input parameters. In this study, it is essential to verify
that our technique for varying LogGP network characteristics satisfies two criteria: first, that the
communication characteristics are varied by the intended amount, and second, that they can be varied
independently.
