Subhlok, Jaspal
2014-11-21
2014-11-21
December 2
2012-12
http://hdl.handle.net/10657/791

Hadoop has emerged as a popular distributed framework for data-intensive computing in clustered environments. It is used mainly for parallel computing problems in which interconnected clusters transfer parts of the data between individual compute nodes to accomplish one job. The clusters are usually connected by shared network infrastructure, where other applications compete for the same bandwidth. Specifically, Hadoop MapReduce jobs suffer when running in parallel with other traffic in the underlying network because they are sensitive to delay between compute phases. We propose a dynamic priority mechanism, realized with the OpenFlow protocol on such an infrastructure, that applies a preferred QoS policy to Hadoop traffic over all other traffic. Moreover, our proposed priority mechanism can be enhanced if additional information on traffic in the underlying network is provided. We propose to use the emerging ALTO (Application-Layer Traffic Optimization) server to provide network traffic information to Hadoop. The ALTO server will be based on the industry standard IF-MAP (Interface for Metadata Access Points) to leverage its publish/subscribe capabilities and flexible schema definitions.

application/pdf
eng
The author of this work is the copyright owner. UH Libraries and the Texas Digital Library have their permission to store and provide access to this work. Further transmission, reproduction, or presentation of this work is prohibited except with permission of the author(s).
Hadoop
Computer science
ACCELERATING DATA-INTENSIVE COMPUTATIONS THROUGH DYNAMIC NETWORK TRAFFIC OPTIMIZATION
2014-11-21
Thesis
born digital