
Traffic Engineering of High-Rate Large-Sized Flows

Tian Jin*, Chris Tracy†, Malathi Veeraraghavan*, Zhenzhen Yan*

* University of Virginia, Charlottesville, VA 22904–4743

Email: {tj3sr,mvee,zy4d}@virginia.edu
† Energy Sciences Network (ESnet), Lawrence Berkeley National Laboratory

Berkeley, CA 94720. Email: [email protected]

Abstract—High-rate large-sized (α) flows have adverse effects on delay-sensitive flows. Research-and-education network providers are interested in identifying such flows within their networks, and directing these flows to traffic-engineered, QoS-controlled virtual circuits. To achieve this goal, a design is proposed for a hybrid network traffic engineering system (HNTES) that would run on an external server, gather NetFlow reports from routers, analyze these reports to identify α-flow source/destination address prefixes, and configure firewall filter rules at ingress routers to extract future flows and redirect them to previously provisioned intra-domain virtual circuits. This paper presents an evaluation of this HNTES design using NetFlow reports collected over a 7-month period from four ESnet routers. Our analysis shows that had HNTES been deployed, it would have been highly effective; e.g., more than 90% of the α bytes that arrived at the four routers over the 7-month period would have been redirected to virtual circuits. Design aspects such as whether to use /24 subnet IDs or /32 addresses in firewall filters, and which router interfaces' NetFlow reports to include in the HNTES analysis, are studied.

I. INTRODUCTION

Research-and-education (REN) network providers have observed that high-rate large-sized flows (henceforth referred to as α flows [1]) have adverse effects on delay-sensitive flows. Therefore, there is an interest in identifying these α flows, and redirecting them to traffic-engineered, QoS-controlled paths.

RENs have deployed dynamic virtual-circuit (VC) services, using MPLS, to complement their IP-routed service. The opportunity for path selection in the VC setup phase is a means for traffic engineering α flows on lightly loaded links. QoS mechanisms such as weighted fair queueing can be used to isolate α-flow packets into separate virtual queues.

As IP routers do not offer built-in capabilities to identify α flows, in prior work [2] we proposed a network management software system, called the hybrid network traffic-engineering system (HNTES), to be run on an external server. HNTES conducts an offline analysis of NetFlow reports, which are exported by routers on a periodic basis. NetFlow reports are created by router hardware, which collects information about packets in a flow (identified by source and destination IP addresses, the IP protocol field, and TCP/UDP port numbers). Repeated patterns are observed in source/destination IP addresses because scientists typically use the same servers to download datasets from supercomputing centers. HNTES leverages this pattern to identify α flows by their addresses, and then sets firewall filters at ingress routers to direct future flows to provisioned, QoS-controlled virtual circuits.

This work focuses on a detailed evaluation of HNTES. It extends our prior evaluation [2], which consisted of NetFlow report analysis from a single router for two months, in three ways. First, we obtained NetFlow reports from 4 different ESnet [3] routers for a longer period of 7 months, and analyzed these reports to determine which router interfaces' reports to monitor in a deployed HNTES. Second, a tradeoff was examined between the use of /24 and /32 address prefixes in firewall filters, with the former offering higher effectiveness in identifying and directing α-flow packets, but the latter offering a lower negative impact on afflicted-flow packets (non-file-transfer packets that share α-flow address prefixes). Third, we determined whether an offline design was sufficient or whether an online HNTES design was required; an online design would allow port numbers to be used in firewall filter rules, making them more specific and thus avoiding the afflicted-flow problem (as scientific data transfer applications such as GridFTP [4] use ephemeral port numbers, an offline design cannot use port numbers in the firewall filter rules).

Our key findings are as follows: (i) Effectiveness (the percentage of α bytes that are identified and directed to virtual circuits) is higher when using NetFlow reports collected at provider-edge (PE) routers, which connect to single customer networks, than when using reports from core routers; therefore, we recommend that HNTES use NetFlow reports from PE routers. (ii) We recommend the use of /24 address prefixes in the firewall filter instead of /32. If /32 addresses are used, there is a higher probability that an α flow is sent to the virtual queue served by the IP-routed service, where it can negatively impact the delay/jitter of many more non-α flows. On the other hand, if /24 prefixes are used, a small percentage of non-α flows are subject to the adverse effects of α flows by being directed to the α-flow virtual queue. (iii) As the percentage of afflicted-flow packets is small, and the effectiveness of the offline HNTES is high (> 90% for the PE routers), we conclude that an online HNTES is not required.

Section II describes the motivating factors for this work. Section III reviews related work. Section IV describes the HNTES design. Our detailed evaluation of offline HNTES using the 4 routers' NetFlow reports is presented in Section V, and Section VI concludes the paper.


Fig. 1: SNMP link utilization measurements on an ESnet core router's 10 Gb/s interface to another research-and-education network (REN) on Jan. 16, 2013; the green line showing bursts reaching over 9 Gbps is the outgoing traffic from the ESnet core router to the REN, while the lower-load blue line is the incoming traffic on the same interface

II. MOTIVATION

As supercomputing speeds increase and storage costs drop, scientists running applications in various disciplines, such as high-energy physics, genomics, and climate studies, generate, store, and move datasets of ever-increasing sizes [5]. To decrease file transfer times, organizations supporting such scientific research invest in high-speed access links and high-end computing clusters that are capable of sourcing and sinking files at high rates. Research-and-education networks, such as the US Department of Energy (DOE)'s Energy Sciences Network (ESnet) [3] and Internet2 [6], which connect most of the DOE national laboratories and large research universities, respectively, experience sudden surges in traffic caused by large scientific dataset transfers.

As an example, Fig. 1 shows a recent burst of traffic observed on Jan. 16, 2013 at one of the ESnet routers. While SNMP link utilization measurements provide only the total number of bytes/sec, this data can be correlated with NetFlow reports. Such analysis has shown that these bursts are most commonly caused by a single α flow or a few α flows rather than by an aggregation of many flows.

Next, we present an example of the adverse effects of an α flow on a delay-sensitive flow, and illustrate how QoS mechanisms can be used to isolate these α flows. In a recently accepted paper [7], we reported on an experiment that was conducted on a high-speed DOE metropolitan-area testbed consisting of high-end computing hosts and two IP routers. All links in this network are 10 Gbps Ethernet. Three flows were created to compete for resources on a single router's outgoing interface: (i) a nuttcp UDP flow that generated packets at 3 Gbps, (ii) a nuttcp TCP flow that was initiated 53 seconds after the start of the experiment, and (iii) a ping flow that sent requests every 1 sec to measure round-trip time. When all three flows were served with best-effort IP service, the TCP flow enjoyed a throughput of more than 6 Gbps, while the ping flow delays increased from 2.3 ms (when the TCP flow was absent) to 60.6 ms (in the presence of the TCP flow). We then configured the router to classify packets into two classes, and direct packets of each class to a separate virtual queue on the egress interface. The UDP and ping flows were placed in the first class, and the TCP flow was placed in the second class. Weighted-fair queueing (WFQ) was used with a 40-60 split between the first and second queues, but the transmitter was configured to operate in a work-conserving mode, which meant that if one of the virtual queues did not have packets, the other queue would be served even in excess of its rate allocation. This allowed the TCP flow to still enjoy 6 Gbps, while the ping delay stayed at 2.3 ms even in the presence of the TCP flow.
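The following is a minimal illustrative sketch (in Python, not the router configuration used in the experiment) of how a work-conserving scheduler with a 40-60 weight split lets the bulk class absorb capacity left unused by the lightly loaded class. Packet sizes, tick granularity, and queue contents are made up for illustration.

```python
from collections import deque

def wfq_serve(queues, weights, link_bytes_per_tick, ticks):
    """Serve packets (sizes in bytes) from per-class queues. Each backlogged
    queue first gets its weighted share; leftover capacity is then given to
    any still-backlogged queue (work-conserving behavior)."""
    sent = [0] * len(queues)
    for _ in range(ticks):
        budget = link_bytes_per_tick
        # Pass 1: weighted shares (the 40-60 split).
        for i, q in enumerate(queues):
            share = int(link_bytes_per_tick * weights[i])
            while q and q[0] <= share and q[0] <= budget:
                pkt = q.popleft()
                share -= pkt
                budget -= pkt
                sent[i] += pkt
        # Pass 2: unused capacity goes to whichever queue is still backlogged.
        for i, q in enumerate(queues):
            while q and q[0] <= budget:
                budget -= q[0]
                sent[i] += q.popleft()
    return sent

if __name__ == "__main__":
    # Queue 0: delay-sensitive class (weight 0.4); queue 1: bulk class (weight 0.6).
    # 1500-byte packets; 125000 bytes per tick is an arbitrary example capacity.
    q0 = deque([1500] * 10)        # lightly loaded class
    q1 = deque([1500] * 100000)    # heavily backlogged bulk class
    print(wfq_serve([q0, q1], [0.4, 0.6], link_bytes_per_tick=125000, ticks=100))
```

Running the sketch shows the delay-sensitive class draining immediately within its share, while the bulk class consumes essentially the full link, mirroring the 6 Gbps TCP throughput observed in the experiment.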

In summary, if α flows can be identified at the ingress routers of a provider's network and subjected to QoS control, the adverse effects of these flows on delay-sensitive flows can be mitigated.

III. RELATED WORK

Terms such as "elephant" flows have been used to characterize large-sized flows by other researchers [8]–[10], while the term "α flows" was introduced by Sarvotham et al. [1]. Definitions of elephant or α flows differ in these papers based on their objectives. Papagiannaki et al. [10] discussed the potential use of their techniques for identifying elephant flows in traffic engineering applications. More recently, two hardware-based solutions for identifying elephant flows within routers have been proposed [11], [12].

General methods for traffic classification include port- and payload-based techniques, both of which have limitations (port numbers are ephemeral and payload-based techniques are hindered by encryption) [13]. General machine learning techniques for traffic classification are of interest in the research community [14]–[17]; some of these solutions are based on analysis of NetFlow reports.

These techniques are more complex but have broad applicability. In contrast, our proposed technique for HNTES works for large scientific data transfers as the servers/clusters used for such transfers have static public IP addresses.

IV. HNTES OVERVIEW

In prior work [2], we proposed a hybrid network traffic-engineering system (HNTES) for α-flow identification and redirection to QoS-controlled, traffic-engineered paths. Traffic engineering is used for path selection. Since the setup phase in virtual-circuit (VC) networking allows for path selection, REN providers, such as ESnet, Internet2, JGN-X, GEANT2, and others, have deployed a dynamic VC service to complement their basic IP-routed service [3]. As α flows require high rates, the use of VCs would allow the circuit scheduler (called an Inter-Domain Controller (IDC) [18]) to choose a less-utilized path. The term "hybrid network" in the name HNTES thus denotes a network with both virtual-circuit and IP-routed services.

HNTES is a network management software system that is deployed on an external server. It communicates with the routers, the IDC, and the NetFlow collector (an external server to which routers export their collected NetFlow reports) within its own network. Its functions are described below.

α-flow address prefix identification: Periodically, HNTES obtains NetFlow reports from the NetFlow collector, and analyzes these reports to identify the source and destination IP address prefixes of α flows. A NetFlow report r has the following parameters:

{I_r, s_r, d_r, p_r, q_r, y_r, f_r, l_r, v_r, o_r},

where I_r: ingress router, s_r: source IP address, d_r: destination IP address, p_r: source port number, q_r: destination port number, y_r: protocol type, f_r: UTC timestamp of the first packet in the report, l_r: UTC timestamp of the last packet in the report, v_r: number of packets in the report, and o_r: total number of octets (bytes) across all packets in the report. In general, NetFlow reports can be collected on any interface of a router, but for HNTES, we proposed to collect these reports on external-facing (inter-domain) interfaces in the incoming direction. Therefore, the router at which NetFlow reports are collected is necessarily the ingress router at which these flows enter the provider's network. As part of NetFlow configuration, administrators set a parameter called the active timeout interval, τ_max, which is the maximum time before the router exports a report to the collector. Therefore, for all NetFlow reports,

0 ≤ l_r − f_r ≤ τ_max

A NetFlow report r is said to be an α NetFlow report if:

o_r ≥ H    (1)

where H is a size threshold. Any flow that has at least one α NetFlow report is classified as an α flow, and the source and destination IP address prefixes {s′_r, d′_r} corresponding to {s_r, d_r} are referred to as the flow's α prefix ID. Assuming that HNTES runs on a nightly basis, it creates a list of α prefix IDs to store in a set F_i, where i is a per-day index.
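To make the nightly identification step concrete, here is a small sketch in Python. The report field names, the in-memory representation, and the example value of H are ours; only the o_r ≥ H test of condition (1) and the /24 prefix ID construction follow the text.

```python
from collections import namedtuple
from ipaddress import ip_network

# Fields follow the report tuple {I_r, s_r, d_r, p_r, q_r, y_r, f_r, l_r, v_r, o_r}.
NetFlowReport = namedtuple(
    "NetFlowReport", "ingress src dst sport dport proto first last pkts octets")

H = 10**9  # size threshold in bytes; an illustrative value, not the paper's setting

def prefix24(addr: str) -> str:
    """Return the /24 subnet ID of an IPv4 address, e.g. '198.51.100.7' -> '198.51.100.0/24'."""
    return str(ip_network(addr + "/24", strict=False))

def alpha_prefix_ids(reports):
    """Build the day's set F_i of (source /24, destination /24) alpha prefix ID pairs."""
    ids = set()
    for r in reports:
        if r.octets >= H:            # condition (1): o_r >= H marks an alpha NetFlow report
            ids.add((prefix24(r.src), prefix24(r.dst)))
    return ids
```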

Configuring routers for future α-flow redirection: The source-destination IP address prefix pairs {s′, d′} in F_i are used to set firewall filter rules at each ingress router to separate out packets from future α flows and redirect them to traffic-engineered, QoS-controlled virtual circuits. While the REN virtual-circuit services are being developed for inter-domain usage, adoption by providers is proceeding slowly. Therefore, HNTES is currently designed to use only intra-domain circuits. The technological solution of carrying IP packets over MPLS label switched paths (LSPs) for segments of an end-to-end path is leveraged by HNTES. On each day i, HNTES determines the egress router E corresponding to each new destination d′ in F_i, and sends a request to the IDC for an LSP, if one does not already exist. The IDC executes three steps: (i) sets up the LSP between ingress router I and egress router E, (ii) configures QoS mechanisms such as WFQ scheduling as described in Section II, and (iii) configures a rule in the firewall filter at router I to identify packets corresponding to {s′, d′} and direct them to the virtual queue served by the MPLS LSP. If an LSP already exists between I and E corresponding to a new {s′, d′} entry in F_i, HNTES communicates directly with the routers to accomplish the actions of steps (ii) and (iii).

Fig. 2: NetFlow reports were obtained from Observation Points (OP) for four ESnet routers: router-1, router-2, router-3, and router-4

Incoming flows on day i whose source and destination addresses match one of the α prefix IDs in the firewall filter F_i will be automatically classified as α flows by the router and directed to the virtual queue for the corresponding MPLS LSP. Thus, if α flows are repeatedly created between the same source-destination hosts/subnets, the HNTES solution will be highly effective in isolating α flows from other flows. To prevent the firewall filter from growing too large, an aging parameter A (e.g., 30 days) is used to delete rules for which no flows have been observed. Thus, HNTES changes the set F_i on a daily basis.
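A hedged sketch of this daily filter-maintenance loop follows. The install_rule and remove_rule callbacks are hypothetical stand-ins for the IDC requests and router configuration steps, and the aging policy shown (drop a rule once no matching flows have been seen for A days) is our reading of the aging parameter.

```python
A = 30  # aging parameter in days

def daily_filter_update(filter_state, day, todays_prefix_ids, install_rule, remove_rule):
    """filter_state maps an alpha prefix ID (src /24, dst /24) to the last day it was observed.
    install_rule/remove_rule stand in for the IDC and router interactions described above."""
    for pid in todays_prefix_ids:
        if pid not in filter_state:
            install_rule(pid)      # request an LSP if needed and add the firewall filter rule
        filter_state[pid] = day    # refresh the rule's "last observed" day
    for pid, last_seen in list(filter_state.items()):
        if day - last_seen > A:    # no matching flows for more than A days
            remove_rule(pid)
            del filter_state[pid]
    return filter_state
```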

In summary, the HNTES design uses an offline approach, in which α prefix IDs are determined through post-analysis, in contrast to an online approach in which α flows would be identified from a live analysis of ongoing traffic.

V. EVALUATION OF HNTES

To evaluate HNTES, we obtained NetFlow reports from four ESnet routers for a 7-month period (May–Nov. 2011, a period of 214 days), and analyzed these reports. The four routers were carefully selected to represent different roles, as shown in Fig. 2. Router-1 and router-2 are provider-edge (PE) routers located at ESnet customers' sites, and hence each is connected to a single customer (DOE national laboratory) network. Router-3 is a core router connected to multiple ESnet PE routers, and to multiple national and international REN peers, such as Internet2 and AARnet. While the REN peers connect to ESnet at some of its other core routers, the ESnet PE routers connected to router-3 are not connected to any other ESnet routers. Thus, all packets from/to the set of customer networks connected to router-3 that are not destined to/sourced from networks within that set pass through router-3. Router-4 is one of several ESnet routers used for commercial peering.

Our NetFlow observation points (OP), as shown in Fig. 2, include only the input side of external-facing (inter-domain) interfaces to avoid double counting flows. For example, α flows in which files are being transferred from the customer network connected to router-1 will be identified from


Fig. 4 compares the cumulative effectiveness for router-1 under the same four aging parameter values for the /24 address prefix case as in Fig. 3. With an aging parameter of 30 days, the cumulative effectiveness values are close to the best-case values obtained when rules are never aged out. Similar results are observed for the other 3 routers. As a value of A = 30 days offers a good tradeoff between high effectiveness and firewall filter size, this value is assumed in analyzing the data for the three goals outlined earlier.

Results – Effectiveness comparison: Row 4 of Table I shows the cumulative effectiveness for each router for /24 and /32 address prefixes. For all routers, this measure is higher for /24 address prefixes. This is because clusters in the same /24 subnet are often used for data transfers, which means that an α flow from a new host (i.e., one from which no α flows were previously observed) will be redirected with /24-prefix-based firewall filter rules, but not with /32-based rules.
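To illustrate this point with made-up addresses, the short Python check below shows why a /24 rule catches an α flow from a previously unseen host in the same subnet while a /32 rule does not.

```python
from ipaddress import ip_address, ip_network

# Addresses are fabricated for illustration. A /24 rule installed after observing one
# alpha source also matches a different host in the same cluster subnet; a /32 rule does not.
rule_24 = ip_network("198.51.100.0/24")       # rule derived from host 198.51.100.10
rule_32 = ip_network("198.51.100.10/32")
new_host = ip_address("198.51.100.11")        # a new host in the same /24

print(new_host in rule_24)   # True  -> future alpha flow is redirected to the LSP
print(new_host in rule_32)   # False -> it stays on the IP-routed path
```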

Row 4 of Table I also shows that the effectiveness values are lower for router-3 and router-4 when compared to the PE routers, router-1 and router-2. For an explanation, consider the following observations made from the results in Rows 4–8 of Table I, Table III, Table II, Fig. 5, and Fig. 6:

TABLE II: Results when aging parameter A = ∞

                                    router-3         router-4
                                    /24     /32      /24     /32
Cumulative Effectiveness, C_214     87%     80%      72%     53%
# of days when E_i = 1              117     80       99      64
# of days when E_i = 0              8       15       22      42

TABLE III: Number of per-day α NetFlow reports

            router-1   router-2   router-3   router-4
Min         0          2          0          0
1st Qu.     27         140.2      8          1
Median      68.5       371.5      23.5       3
Mean        188.2      619.7      97.7       4.5
3rd Qu.     195        823.8      106        5.75
Max         2337       7345       1411       62

1) The high cumulative effectiveness for the PE routers, router-1 and router-2, for the /24 prefix, shown in Row 4 of Table I, is supported by Fig. 5, Fig. 6, and Row 5 of Table I. Fig. 5 shows that the router-1 daily effectiveness value is 1 on many days (quantified as 90 days in Row 5), which means that a significant fraction of α flows would have been identified and directed to the appropriate virtual circuits because of firewall filter entries. This is consistent with Fig. 6, which shows that the daily effectiveness E_i > 90% on more than 150 days for router-1 and on more than 130 days even for router-3.

Fig. 5: Daily effectiveness for router-1 with /24 prefixes and A = 30

2) The lower cumulative effectiveness for router-3 and router-4 in Row 4 of Table I is supported by the higher number of days when E_i = 0 for these routers, as seen in Row 6 of Table I, and by the larger (0, 0.1) bar for router-3 in Fig. 6. The numbers presented in Table II suggest that a larger aging parameter can be used at router-3 and router-4 to improve effectiveness. Given the fairly small firewall-filter sizes for these routers seen in Row 1 of Table I, the higher number of days on which α flows were not observed at router-3 and router-4 (see Row 7 of Table I), and the lower number of α NetFlow reports seen in Table III (a maximum value of only 62 at router-4), the firewall filter size should be acceptable even at higher values of the aging interval.

3) There are fewer α prefix IDs (Row 8 of Table I) but a larger number of days when E_i = 1 at router-3 and router-4 than at the PE routers for /32 addresses.

4) For /24 addresses, the number of α prefix IDs is lower for router-3 than for router-2 (see Row 8 of Table I), even though the latter is one of the PE routers connected to the former.

Explanations for observations:

Observation 1: The PE routers are connected to ESnet customer sites that house supercomputing facilities on which scientists run their applications and generate datasets. As scientists repeatedly use these facilities, α flows occur between the same source-destination pairs. A firewall filter rule created with an address prefix pair observed on one day is repeatedly able to redirect packets from future α flows.

Observation 2: The lower number of α NetFlow reports at router-3 and router-4 is because there are fewer uploads of large datasets to ESnet customer sites than downloads from them. Since these sites are DOE national laboratories with supercomputing centers, more α flows are likely


“afflicted flows.” The /24 and /32 choices are compared on measures related to afflicted-flow packets.

Next, we determine the percentage of afflicted-flow packets in samples of packets that share α prefix IDs. The purpose of this analysis is to meet our goal of determining whether an online HNTES is required. An online HNTES that analyzes live traffic could extract α-flow port numbers and use these in firewall filter rules, unlike an offline HNTES that can only use source and destination addresses/address prefixes. With these more specific, port-based filter rules, there would be no afflicted flows. Thus, if the percentage of afflicted-flow packets is high in these samples, we will conclude that an online HNTES is required. If not, we will conclude that an offline HNTES is sufficient.

Methodology: On any given day i, set A_i represents the set of α NetFlow reports, as defined in Section V-A. A set P_i of α prefix IDs for day i is defined to include the address prefixes of all α flows observed in set A_i. Then a set B_i of non-α NetFlow reports (i.e., NetFlow reports that do not cross the H-byte threshold in (1)) is extracted for day i such that for every r ∈ B_i, o_r < H and {s′_r, d′_r} ∈ P_i. Packets from flows represented by NetFlow reports in set B_i form a sample of packets that would be directed to the α-flow virtual queue and its MPLS LSP because they unfortunately share α prefix IDs. An assumption is made here that all prefix IDs in set P_i are in the firewall filter (a fair assumption for most days, as seen in Fig. 5).

Towards identifying the percentage of non-file-transfer (non-FT) flow packets within set B_i, we apply three steps in sequence. First, we extract out the NetFlow reports corresponding to α flows identified by set A_i. Next, we find the set of NetFlow reports from file transfers using a heuristic. Third, we separate out NetFlow reports from connections with well-known port numbers. These steps are applied in sequence to distinguish flows from scp, a file transfer application that uses the ssh well-known port number (some of these flows could fall in the first, α-flow, category or the second, non-α file-transfer, category), from interactive ssh flows, such as those from a remote terminal application such as SecureCRT (third category). Flows from the third category and the leftover NetFlow reports are the ones considered to be "afflicted."

NetFlow reports in sets B_i, 1 ≤ i ≤ 214, are classified into four groups:

• C_i, the set of reports from α flows: r ∈ C_i iff there is a report r′ ∈ A_i such that s_r = s_r′, d_r = d_r′, p_r = p_r′, q_r = q_r′, and y_r = y_r′ (see Section IV for notation).

• D_i, the set of reports from other file transfers: r ∈ D_i iff r ∈ B_i − C_i, o_r/v_r > 1000 bytes, o_r > G where G < H, and there exists another report r′ ∈ B_i − C_i such that s_r = s_r′, d_r = d_r′, p_r = p_r′, q_r = q_r′, y_r = y_r′, o_r′/v_r′ > 1000, and o_r′ > G. Observations have shown that flow reports that meet these criteria are typically from file-transfer applications.

• W_i, the set of non-FT NetFlow reports with well-known port numbers: r ∈ W_i iff r ∈ B_i − C_i − D_i and p_r or q_r is one of several well-known port numbers (ssh, http, imap, smtp, ssmtp, https, nntp, imaps, imap4ssl, unidata, rtsp, rsync, sftp, bftp, ftps, pop3, and sslpop).
• L_i, the set of leftover NetFlow reports, which is B_i − C_i − D_i − W_i.

Fig. 7: Daily number of packets in W + L from router-1 reports, for /24 and /32 address prefixes

Let B, C, D, W, and L be the aggregate sets of the corresponding per-day sets, e.g., B = ∪_{1≤i≤214} B_i. Flows corresponding to the NetFlow reports in set W + L are considered to be afflicted flows.
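A sketch of this per-day classification in Python follows, reusing the hypothetical NetFlowReport records from the earlier sketch; the port list is abbreviated, and grouping D_i candidates by five-tuple is our reading of the "another report" condition.

```python
# Standard IANA ports for some of the services named above (abbreviated list).
WELL_KNOWN_PORTS = {22, 25, 80, 110, 119, 143, 443, 554, 873, 990, 993}

def five_tuple(r):
    return (r.src, r.dst, r.sport, r.dport, r.proto)

def classify_day(B, A, G):
    """Split one day's sampled reports B_i into (C_i, D_i, W_i, L_i), given the
    day's alpha reports A_i and the file-transfer size threshold G (G < H)."""
    alpha_tuples = {five_tuple(r) for r in A}
    C = [r for r in B if five_tuple(r) in alpha_tuples]
    rest = [r for r in B if five_tuple(r) not in alpha_tuples]       # B_i - C_i
    # D_i heuristic: average packet size > 1000 bytes, more than G bytes, and at
    # least one other report with the same five-tuple meeting the same test.
    candidates = {}
    for r in rest:
        if r.pkts > 0 and r.octets / r.pkts > 1000 and r.octets > G:
            candidates.setdefault(five_tuple(r), []).append(r)
    D = [r for reports in candidates.values() if len(reports) > 1 for r in reports]
    d_ids = set(map(id, D))
    rest = [r for r in rest if id(r) not in d_ids]                   # B_i - C_i - D_i
    W = [r for r in rest if r.sport in WELL_KNOWN_PORTS or r.dport in WELL_KNOWN_PORTS]
    w_ids = set(map(id, W))
    L = [r for r in rest if id(r) not in w_ids]                      # leftover reports
    return C, D, W, L
```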

The two metrics for the afflicted-flow analysis are as follows: the daily number of packets in NetFlow reports in set W + L, and the percentage of afflicted-flow packets, which is given by

AF_i = [ Σ_{k=1}^{i} Σ_{r ∈ (W_k ∪ L_k)} v_r ] / [ Σ_{k=1}^{i} Σ_{r ∈ (D_k ∪ W_k ∪ L_k)} v_r ],   1 ≤ i ≤ 214.    (6)

Unlike in the effectiveness analysis, where bytes were used, here packets are used because the estimation of bytes with the multiplier factor of 1000 is less accurate with non-α flows (recall the 1-in-1000 NetFlow packet sampling rate).
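The cumulative metric (6) can then be computed from the per-day sets, as in this small sketch (again using the hypothetical report records introduced earlier).

```python
def afflicted_fraction(per_day_sets):
    """per_day_sets: iterable of (D_k, W_k, L_k) report lists, in day order.
    Yields the cumulative AF_i of equation (6) after each day."""
    num = den = 0
    for D, W, L in per_day_sets:
        num += sum(r.pkts for r in W) + sum(r.pkts for r in L)   # packets in W_k and L_k
        den += sum(r.pkts for r in D + W + L)                    # packets in D_k, W_k, L_k
        yield num / den if den else 0.0
```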

Results: Fig. 7 shows the daily number of afflicted-flow packets in W + L for router-1, when G is set to 10 MB. Similar graphs are observed for the three other routers. On this measure, /32 address prefixes in firewall filters enjoy an advantage over /24 address prefixes because of the former's higher specificity. This contrasts with the advantage enjoyed by /24 address prefixes over /32 prefixes on the effectiveness measure.

TABLE IV: Percentage of afflicted-flow packets, AF_214

        router-1   router-2   router-3   router-4
/24     10.39%     23.84%     6.22%      25.37%
/32     11.22%     13.18%     3.43%      25.51%


Table IV shows the second metric, the percentage of afflicted-flow packets over the 214-day period. These percentages are not significantly high even for /24 address prefixes. Furthermore, considering that the number of non-α flows that do not share α prefix IDs is much higher than the number of α flows, when the afflicted-flow packets are considered as a percentage of the total number of non-α-flow packets, the relative negative effect of using /24 prefixes is even lower.

We therefore conclude that the choice of /24 address prefixes for the firewall filter is better than /32. If /32 prefixes are used, there is a higher probability that an α flow is sent to the virtual queue served by the IP-routed service, where it can negatively impact the delay/jitter of many more non-α flows. On the other hand, if /24 prefixes are used, a small percentage of non-α flows are subject to the adverse effects of α flows by being directed to the α-flow virtual queue and its MPLS LSP.

With regard to online vs. offline HNTES, we conclude that, as the percentage of afflicted-flow packets is sufficiently small and the effectiveness is high, offline HNTES is a sufficient solution.

VI. CONCLUSIONS

Towards designing a traffic engineering system for high-rate large-sized (α) flows, NetFlow reports were collected at four ESnet routers for 7 months and analyzed. The analysis shows that it is feasible to deploy an offline hybrid network traffic engineering system (HNTES) that can be highly effective in identifying and isolating α flows to mitigate their adverse effects on other flows. Our analysis led us to conclude the following: (i) offline HNTES is highly effective and does not cause an afflicted-flow packet percentage high enough to necessitate an online HNTES that could additionally include port numbers in firewall filter rules (afflicted flows are non-file-transfer flows whose packets get directed to α-flow virtual queues and the corresponding virtual circuits because of shared address prefixes); (ii) the use of /24 address prefixes in firewall filter rules leads to higher effectiveness than /32 addresses without significantly increasing the percentage of afflicted-flow packets; and (iii) NetFlow report observation points should include both directions of the interfaces connecting provider-edge routers to single customer networks rather than core router interfaces.

VII. ACKNOWLEDGMENT

The University of Virginia portion of this work was supported by the U.S. Department of Energy (DOE) grants DE-SC0002350 and DE-SC0007341, and NSF grants OCI-1038058, OCI-1127340, and CNS-1116081. The ESnet portion of this work was supported by the Director, Office of Science, Office of Basic Energy Sciences, of the U.S. DOE under Contract No. DE-AC02-05CH11231.

REFERENCES

[1] S. Sarvotham, R. Riedi, and R. Baraniuk, "Connection-level analysis and modeling of network traffic," in ACM SIGCOMM Internet Measurement Workshop 2001, November 2001, pp. 99–104.
[2] Z. Yan, C. Tracy, and M. Veeraraghavan, "A hybrid network traffic engineering system," in High Performance Switching and Routing (HPSR), 2012 IEEE 13th International Conference on, June 2012, pp. 141–146.
[3] ESnet. [Online]. Available: http://www.es.net/
[4] GridFTP. [Online]. Available: http://globus.org/toolkit/docs/3.2/gridftp/
[5] "Terabit networks for extreme-scale science workshop report." [Online]. Available: http://science.energy.gov/~/media/ascr/pdf/program-documents/docs/Terabit networks workshop report.pdf
[6] Internet2. [Online]. Available: http://www.internet2.edu/
[7] Z. Yan, M. Veeraraghavan, C. Tracy, and C. Guok, "On how to provision Quality of Service (QoS) for large dataset transfers," in Proceedings of the Sixth International Conference on Communication Theory, Reliability, and Quality of Service (CTRQ), 2013.
[8] K.-C. Lan and J. Heidemann, "A measurement study of correlations of Internet flow characteristics," Computer Networks, vol. 50, no. 1, pp. 46–62, 2006.
[9] J. Wallerich, H. Dreger, A. Feldmann, B. Krishnamurthy, and W. Willinger, "A methodology for studying persistency aspects of Internet flows," ACM SIGCOMM Computer Communication Review, vol. 35, no. 2, 2005.
[10] K. Papagiannaki, N. Taft, S. Bhattacharyya, P. Thiran, K. Salamatian, and C. Diot, "A pragmatic definition of elephants in Internet backbone traffic," in IMW '02: Proceedings of the 2nd ACM SIGCOMM Workshop on Internet Measurement, 2002, pp. 175–176.
[11] T. Pan, X. Guo, C. Zhang, J. Jiang, H. Wu, and B. Liu, "Tracking millions of flows in high speed networks for application identification," in INFOCOM, 2012 Proceedings IEEE, March 2012, pp. 1647–1655.
[12] M. Zadnik, M. Canini, A. Moore, D. Miller, and W. Li, "Tracking elephant flows in Internet backbone traffic with an FPGA-based cache," in International Conference on Field Programmable Logic and Applications (FPL 2009), Aug. 31–Sept. 2, 2009, pp. 640–644.
[13] T. T. T. Nguyen and G. J. Armitage, "A survey of techniques for Internet traffic classification using machine learning," IEEE Communications Surveys and Tutorials, vol. 10, no. 4, pp. 56–76, 2008.
[14] M. Roughan, S. Sen, O. Spatscheck, and N. Duffield, "Class-of-service mapping for QoS: a statistical signature-based approach to IP traffic classification," in Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement. ACM, 2004, pp. 135–148.
[15] J. Park, H.-R. Tyan, and C.-C. Kuo, "Internet traffic classification for scalable QoS provision," in IEEE International Conference on Multimedia and Expo, 2006. IEEE, 2006, pp. 1221–1224.
[16] V. Carela-Español, P. Barlet-Ros, and J. Solé-Pareta, "Traffic classification with sampled NetFlow," Technical Report UPC-DAC-RR-CBA-2009-6, Feb. 2009.
[17] D. Rossi and S. Valenti, "Fine-grained traffic classification with NetFlow data," in Proceedings of the 6th International Wireless Communications and Mobile Computing Conference. ACM, 2010, pp. 479–483.
[18] A. Lake, J. Vollbrecht, A. Brown, J. Zurawski, D. Robertson, M. Thompson, C. Guok, E. Chaniotakis, and T. Lehman, "Inter-domain Controller (IDC) Protocol Specification," May 2008. [Online]. Available: https://wiki.internet2.edu/confluence/download/attachments/19074/IDC-Messaging-draft.pdf