Electronic Theses and Dissertations (2010 - Present)
Permanent URI for this community: https://hdl.handle.net/10657/1
The University of Houston Libraries collect and make publicly available all electronic theses and dissertations (ETDs) produced in UH graduate and PhD programs through the UH institutional repository. ETDs become available after the student submits them to the UH Graduate School, the document is approved by all appropriate parties, and any embargo on the document expires.
Collection Scope
UH Libraries began publishing ETDs from several UH Colleges in 2010. As of Summer 2014, all UH Colleges that require a thesis or dissertation for graduation began submitting these documents in electronic format. Below is a list of UH Colleges that currently participate in the ETD program and their coverage dates in this repository.
| UH College | Coverage Dates |
| --- | --- |
| C.T. Bauer College of Business | 2010-Present |
| Cullen College of Engineering | 2012-Present |
| Conrad N. Hilton College of Hotel and Restaurant Management | 2015-Present |
| College of Education | 2010-Present |
| College of Liberal Arts and Social Sciences | 2012-Present |
| College of Natural Sciences and Mathematics | 2012-Present |
| College of Optometry | 2010-Present |
| College of Pharmacy | 2010-Present |
| College of Technology | 2012-Present |
| K. G. McGovern College of the Arts | 2016-Present |
| G. D. Hines College of Architecture & Design | 2016-Present |
| Graduate College of Social Work | 2012-Present |
Additional Information
- Online access for content outside these coverage dates may be available electronically through ProQuest. Note: As of Fall 2017, all theses and dissertations produced at UH will be submitted to ProQuest. Additionally, some UH Colleges have contributed content to ProQuest during various periods in the past.
- For print theses and dissertations outside these coverage dates, please consult the UH Libraries catalog.
- Additional information on submitting ETDs can be found at the UH Graduate School.
Questions?
Feel free to contact us should you have any questions or comments.
Browse
Browsing Electronic Theses and Dissertations (2010 - Present) by Department "Computer Science, Department of"
Item 3D facial modeling with geometric wrinkles from images (2023-04-27)
Deng, Qixin; Deng, Zhigang; Pavlidis, Ioannis T.; Chen, Guoning; Mayerich, David
Realistic 3D facial modeling and reconstruction are increasingly used in graphics, animation, and virtual reality applications. Many existing face models cannot present rich detail while deforming; that is, they lack wrinkles as the face shows different expressions. Moreover, creating a realistic face model for an individual requires a complex setup and sophisticated work by experienced artists. The goal of this dissertation is an end-to-end system that augments coarse-scale 3D face models and reconstructs realistic faces from in-the-wild images. I propose an end-to-end method to automatically augment coarse-scale 3D faces with synthesized fine-scale geometric wrinkles. I define a wrinkle as a displacement value along the vertex normal direction and store it as a displacement map. The distribution of wrinkles has spatial characteristics, and deep convolutional neural networks (DCNNs) excel at learning spatial information from image-format data. I label the wrinkle data with identity and expression vectors. By formulating wrinkle generation as a supervised generation task, I implicitly model the continuous space of face wrinkles via a compact generative model, such that plausible face wrinkles can be generated through effective sampling and interpolation in that space. I then introduce a complete pipeline to transfer the synthesized wrinkles between faces with different shapes and topologies. The method can augment an existing 3D face model with fine-scale details, but creating a fully realistic human face model remains unsolved: properly modeling complex lighting effects, including specular lighting, shadows, and occlusions, from a single in-the-wild face image is still considered a wide-open research challenge.
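The displacement-map representation described above, a scalar offset per vertex along its normal, can be sketched in a few lines of NumPy. The mesh, normals, and displacement values below are hypothetical toy data for illustration, not the dissertation's models:

```python
import numpy as np

def apply_wrinkles(vertices, normals, displacement):
    """Offset each vertex along its (unit) normal by a scalar displacement.

    vertices:     (N, 3) coarse-mesh vertex positions
    normals:      (N, 3) per-vertex normals (normalized here for safety)
    displacement: (N,)   scalar wrinkle depth per vertex (a displacement map
                         sampled at the vertices)
    """
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    return vertices + displacement[:, None] * n

# Toy example: two vertices pushed along +z by different amounts.
v = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
nrm = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, 1.0]])  # non-unit normal is fine
d = np.array([0.1, -0.05])
wrinkled = apply_wrinkles(v, nrm, d)
print(wrinkled)  # [[0. 0. 0.1], [1. 0. -0.05]]
```

Because the map stores only scalar depths, the same wrinkle pattern can in principle be resampled onto another mesh's vertices, which is the intuition behind transferring wrinkles between faces of different topologies.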
To reconstruct a realistic face model from an unconstrained image, I propose a CNN-based framework that regresses the face model from a single in-the-wild image. I designed novel hybrid loss functions to disentangle face shape identity, expression, pose, albedo, and lighting. The output face model includes dense 3D shape, head pose, expression, diffuse albedo, specular albedo, and the corresponding lighting conditions.
Item 3D Reconstruction of Tubular Structures Using MRI Projection Images (2018-05)
Unan, Mahmut 1986-; Tsekos, Nikolaos V.; Shah, Dipan J.; Leiss, Ernst L.; Shi, Weidong
After imaging information became available in digital form, techniques for acquiring volumetric data evolved; 3D reconstruction is mostly performed from multislice image stacks. The objective of this dissertation is to introduce a simple magnetic resonance technique for imaging tubular structures, such as blood vessels and catheters, and for the 3D reconstruction of these structures. The study comprises three major chapters: one on simulation and two on experiments with MRI projection images. First, a MATLAB simulation was created to analyze the reconstruction process; it was tested with different structure shapes and different numbers of projections. Second, triplanar projection imaging was evaluated on a phantom filled with a T1-shortening, Gd-based contrast agent embedded in a lipid matrix. The object is reconstructed from three mutually orthogonal projections of the volume containing the structure of interest. The projected structures were segmented on each projection, back-projected to generate the segmented tubular object, and mesh-rendered in 3D. The accuracy of this approach was investigated by comparing the mesh-rendered tubular structure generated from projections with the mesh rendered from a multislice set of images of the same volume.
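The triplanar back-projection step described above can be sketched as intersecting the back-projections of three orthogonal binary masks: a voxel is kept only when all three projections see it. This toy NumPy sketch uses a hypothetical single-voxel object, not the study's phantom data, and illustrates only the geometric idea:

```python
import numpy as np

def backproject_triplanar(p_xy, p_xz, p_yz):
    """Reconstruct a binary volume from three mutually orthogonal binary
    projections by back-projecting each mask along its projection axis and
    intersecting the results. The intersection generally over-estimates
    the true object (the classic back-projection artifact)."""
    return (p_xy[:, :, None].astype(bool)
            & p_xz[:, None, :].astype(bool)
            & p_yz[None, :, :].astype(bool))

# Toy object: a single voxel at (1, 2, 0) in a 3x4x2 grid.
true_vol = np.zeros((3, 4, 2), dtype=bool)
true_vol[1, 2, 0] = True
p_xy = true_vol.any(axis=2)   # project along z
p_xz = true_vol.any(axis=1)   # project along y
p_yz = true_vol.any(axis=0)   # project along x
recon = backproject_triplanar(p_xy, p_xz, p_yz)
print(recon.sum())  # 1 -- the single voxel is recovered exactly
```

For thin, sparse structures such as vessels or catheters, the over-estimation from only three views stays small, which is why the triplanar approach is attractive despite its simplicity.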
Third, the inverse Radon transform was implemented for 3D reconstruction of a complex helical tubular structure from multiple radially deployed (oblique) projections. To assess the correctness of the 3D reconstruction, we compared the resulting meshes with the multislice-rendered meshes, using the Hausdorff distance and point-cloud comparison methods to evaluate reconstruction error. The average error was less than 1 pixel for the triplanar projection images and less than 2 pixels for the oblique projection images. With further optimization and reduction of acquisition time, this method can be used for fast 3D imaging of interventional tools or segments of blood vessels, with applications in interventional MRI.
Item A Checkpointing Restart Approach for OpenSHMEM Fault Tolerance (2016-05)
Hao, Pengfei 1989-; Chapman, Barbara M.; Shamis, Pavel; Gabriel, Edgar
The Partitioned Global Address Space (PGAS) model has recently emerged for parallel programming at large scale. The PGAS ecosystem contains libraries and languages (often implemented atop those libraries). One such library is OpenSHMEM, which offers an intuitive and easy-to-use API; its main feature is one-sided communication, in which communication and computation can be overlapped easily. Performing computational science at large scale requires a resilient computing environment. Current computer systems, although generally reliable, do suffer occasional faults. As leadership high-performance computing systems trend toward Exascale, faults will lead to system failures that cause fatal software failures. Mitigating this problem requires software resilience, or "fault tolerance." One common approach is to checkpoint and restart from a known good state when an error is detected.
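The checkpoint/restart pattern just described can be sketched as a loop that periodically persists its state and, on restart, resumes from the last saved state. This is a hypothetical single-process illustration of the general pattern, not the thesis's OpenSHMEM implementation:

```python
import os
import pickle
import tempfile

def run_with_checkpoints(total_steps, ckpt_path, interval=10, fail_at=None):
    """Toy checkpoint/restart loop: persist state every `interval` steps and,
    on restart, resume from the last saved state instead of step 0."""
    # Restart path: load the last known good state if one exists.
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            step, acc = pickle.load(f)
    else:
        step, acc = 0, 0
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated fault")   # would normally kill the job
        acc += step                                  # the 'computation'
        step += 1
        if step % interval == 0:                     # checkpoint to disk
            with open(ckpt_path, "wb") as f:
                pickle.dump((step, acc), f)
    return acc

ckpt = os.path.join(tempfile.mkdtemp(), "state.pkl")
try:
    run_with_checkpoints(100, ckpt, fail_at=57)      # first run dies at step 57
except RuntimeError:
    pass
result = run_with_checkpoints(100, ckpt)             # restart resumes from step 50
print(result)  # 4950 == sum(range(100)), as if no fault had occurred
```

In a PGAS setting the hard part, which this sketch omits, is coordinating the checkpoint across processes so that one-sided communication does not leave a partner's memory in an inconsistent state.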
A long-running (e.g., weeks or months) program without fault tolerance will suffer failure-restart cycles, which introduce unacceptably long, uncertain execution times and greatly increased resource usage. In this thesis, we explore a checkpoint-and-restart fault tolerance scheme specialized for the needs of PGAS programming models, using OpenSHMEM as a concrete implementation. Using a 1-D Jacobi code, we show that this approach is scalable and can save considerable resources. Ideas for more general solutions and other approaches are presented as future work.
Item A Clinical Protocol to Validate a Breast Conservative Therapy Outcome Model (2013-08)
Lepoutre, Nicole G. 1990-; Garbey, Marc; Hilford, Victoria; Bass, Barbara L.
Breast cancer is the most common cancer among women worldwide. Among the treatment modalities for breast cancer is Breast Conservative Therapy (BCT), in which the complete tumor and a margin of healthy tissue are surgically removed and the remaining breast tissue receives radiotherapy. From the patient's point of view, preserving the shape of her breast is essential to her quality of life, and hence BCT may be a good treatment option. Although this treatment usually provides good clinical results, cosmetic defects such as asymmetry may emerge. The goal of our team is to offer surgeons a tool that predicts the shape of the breast after BCT. To that end, a multiscale model has been implemented by members of our team and is being validated and fine-tuned, and a clinical study has been designed.
The objective of this Master's thesis is to determine whether the predicted shape of the breast is realistic by:
- acquiring numerical data during the patients' follow-up,
- providing a confident 3D reconstruction of the breast at given points in time,
- understanding the main steps of the healing process.
Item A Code Structure Visualization Tool for Groovy (2013-12)
Saha, Manas K. 1966-; Subramaniam, Venkat; Shah, Shishir Kirit; Ramamurthy, Uma; Subhlok, Jaspal
Real-world systems are often complex by nature, and dealing with that complexity takes a great amount of effort and time. A visualization tool can help programmers understand the code structure and ease that effort; this work is an attempt to build such a tool. The code structure of a program is represented by an abstract syntax tree (AST), and a language like Groovy provides an easy way to tap into that structure. Furthermore, features like metaprogramming help decipher the structural information, making Groovy a natural choice for building such a tool on the Java Virtual Machine (JVM). The visualization tool we developed as part of this thesis shows the hierarchical structure of an entire program as well as selected parts of a large, complex codebase. Using its features, programmers can visually navigate the code structure to inspect and understand how the program is organized. The tool not only displays the structure but can also dynamically display a structure altered through compile-time metaprogramming.
Item A Compiler Optimization Framework for Directive-Based GPU Computing (2016-08)
Tian, Xiaonan 1983-; Chapman, Barbara M.; Gabriel, Edgar; Subhlok, Jaspal; Shi, Weidong; Rodgers, Gregory
In the past decade, accelerators, most commonly Graphics Processing Units (GPUs), have played a key role in achieving Petascale performance and driving efforts to reach Exascale.
However, significant advances in programming models for accelerator-based systems are required to close the gap between achievable and theoretical peak performance, and these advances should not come at the cost of programmability. Directive-based programming models for accelerators, such as OpenACC and OpenMP, help non-expert programmers parallelize applications productively and enable incremental parallelization of existing codes, but they typically yield lower performance than CUDA or OpenCL. The goal of this dissertation is to shrink this performance gap by supporting fine-grained parallelism and locality awareness within a chip. We propose a comprehensive loop-scheduling transformation to help users more effectively exploit the multi-dimensional thread topology within the accelerator. An innovative redundant execution mode is developed to reduce unnecessary synchronization overhead. Our data-locality optimizations utilize the different types of memory in the accelerator; where compiler analysis is insufficient, an explicit directive-based approach guides the compiler to perform specific optimizations. We implement and evaluate our work using the OpenACC programming model, as it is the most mature and well-supported of the directive-based accelerator programming models, with multiple commercial implementations; however, the proposed methods can also be applied to OpenMP. For the hardware platform, we choose GPUs from Advanced Micro Devices (AMD) and NVIDIA, as well as Accelerated Processing Units (APUs) from AMD. We evaluate the proposed compiler framework and optimization algorithms with the SPEC and NAS OpenACC benchmarks; the results suggest that these approaches are effective for improving the overall performance of code executing on GPUs. With the data-locality optimizations, we observed speedups of up to 3.22 on the NAS benchmarks and 2.41 on the SPEC benchmarks.
Overall, the proposed compiler framework generates code whose performance is competitive with the state-of-the-art commercial compiler from NVIDIA PGI.
Item A Computational Framework for Finding Interestingness Hotspots in Spatial Datasets (2016-12)
Akdag, Fatih 1982-; Eick, Christoph F.; Gabriel, Edgar; Chen, Guoning; Solorio, Thamar; Choi, Yunsoo
The significant growth of spatial data has increased the need for automated discovery of spatial knowledge, and an important task in analyzing spatial data is hotspot discovery. In this dissertation, we propose a novel methodology for discovering interestingness hotspots: contiguous regions in space that are interesting according to a domain expert's notion of interestingness, captured by an interestingness function. We propose computational methods for finding interestingness hotspots in point-based and polygonal spatial datasets and in gridded spatial-temporal datasets. The proposed framework identifies hotspots maximizing an externally given interestingness function, defined on any number of spatial or non-spatial attributes, using a five-step methodology: (1) identify neighboring objects in the dataset, (2) generate hotspot seeds, (3) grow hotspots from the identified seeds, (4) post-process to remove highly overlapping, redundant neighboring hotspots, and (5) find the scope of the hotspots. In particular, we introduce novel hotspot-growing algorithms that grow hotspots from seeds; a novel growing algorithm for point-based datasets operates on Gabriel graphs, which capture the neighboring relationships of objects in a spatial dataset. Moreover, we present a novel graph-based post-processing algorithm that removes highly overlapping hotspots and employs a graph-simplification step that significantly improves the runtime of finding a maximum-weight independent set in the overlap graph of hotspots.
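The post-processing step above selects a non-overlapping, high-weight subset of hotspots, i.e., an independent set in the overlap graph. As a hedged illustration only, here is the kind of simple greedy heuristic often used for maximum-weight independent set; the dissertation's actual algorithm, with its graph-simplification step, is more sophisticated, and the hotspot ids, weights, and overlap edges below are invented:

```python
def select_hotspots(hotspots, overlaps):
    """Greedy heuristic for the maximum-weight-independent-set step: keep the
    highest-weight hotspot first, then any hotspot that does not overlap an
    already-kept one. `hotspots` maps id -> interestingness weight;
    `overlaps` is a set of frozenset pairs marking highly overlapping
    hotspots (the edges of the overlap graph)."""
    kept = []
    for hid in sorted(hotspots, key=hotspots.get, reverse=True):
        if all(frozenset((hid, k)) not in overlaps for k in kept):
            kept.append(hid)
    return kept

# Hypothetical hotspots: A overlaps B and C; B overlaps D.
weights = {"A": 5.0, "B": 2.0, "C": 2.0, "D": 1.0}
edges = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "D")]}
print(select_hotspots(weights, edges))  # ['A', 'D']
```

Greedy selection is fast but not optimal in general, which is why exact maximum-weight independent set solvers, made tractable by simplifying the overlap graph first, are worth the extra machinery.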
The proposed post-processing algorithm is quite generic and can be combined with any method that must cope with overlapping hotspots or clusters; the graph-simplification step can likewise be adapted as a preprocessing step by algorithms that find maximum-weight cliques and maximum-weight independent sets in graphs. Furthermore, we propose a computational framework for finding the scope of two-dimensional point-based hotspots. We evaluate our framework in case studies using a gridded air-pollution dataset and point-based crime and taxicab datasets, in which we find hotspots based on different interestingness functions, and we compare our framework with a state-of-the-art hotspot discovery technique. Experiments show that our methodology accurately discovers interestingness hotspots and compares well against traditional hotspot detection methods.
Item A Computational Framework to Understand Vascular Adaptation (2015-05)
Rahman, Mahbubur 1982-; Garbey, Marc; Berceli, Scott A.; Tsekos, Nikolaos V.; Gabriel, Edgar; Hilford, Victoria
Researchers have applied a wide variety of approaches to understanding vascular adaptation over the past two decades. However, the specific causes, effects, and links among hemodynamic factors, inflammatory biochemical mediators, cellular effectors, and the vascular occlusive phenotype remain unexplained to this day. To explain these biological phenomena, we introduce a multiscale computational framework to systematically test hypotheses associated with vascular adaptation, and we apply this framework to explain some widely observed clinical and experimental cases. Our framework incorporates the cellular activities inside the vein graft as influenced by shear stress and tension, two of the most important environmental factors in vascular adaptation.
The framework is a hybrid agent-based model (ABM) coupled with the partial differential equations (PDEs) used to calculate shear stress. On top of this framework, we designed and developed a modular, adaptive, efficient, and scalable simulation program so that specific pattern formations associated with vascular adaptation can be explained in real time by the framework's pattern-recognition algorithms. Finally, we coupled a genetic algorithm with the framework to verify that a combination of interesting patterns associated with vascular adaptation can be regenerated in a multivariate data-analysis environment. This research thereby narrows the gap in understanding the different cases observed in vascular adaptation.
Item A Computational Image-Based Guidance System for Precision Laparoscopy (2016-12)
Nguyen, Toan B. 1984-; Tsekos, Nikolaos V.; Pavlidis, Ioannis T.; Vilalta, Ricardo; Garbey, Marc
This dissertation presents our progress toward building a computational image-based guidance system for precision laparoscopy, in particular laparoscopic liver resection. Aiming to keep the working goal as simple as possible, we focused on the most important questions in laparoscopy: predicting the new location of tumors and of the resection plane after a liver maneuver during surgery. Our approach was to build a mechanical model of the organ based on pre-operative images and register it to intra-operative data, and we proposed several practical, cost-effective methods to obtain the intra-operative data in a real procedure. We integrated all of these into a framework on which we could develop new techniques without rebuilding everything. To test the system, we performed an experiment on a porcine liver in a controlled setup in which a wooden lever elevated part of the liver to access its posterior.
We confirmed that our model has decent accuracy for tumor location (approximately 2 mm error) and for the resection plane (1% difference in remaining liver volume after resection); however, the overall shape of the liver and the fiducial markers still left much to be desired. For further corrections to the model, we also developed an algorithm to reconstruct the 3D surface of the liver using Smart Trocars, a new surgical-instrument recognition system. The algorithm was verified in an experiment on a plastic model using the laparoscopic camera as a means of obtaining surface images; this method had millimetric accuracy provided the angle between two endoscope views was not too small. To transition our research from porcine to human livers, in-vivo experiments were conducted on cadavers, from which we found a new method that uses a high-frequency ventilator to eliminate respiratory motion. The framework showed the potential to work on real organs in clinical settings; hence, the cadaver studies need to be continued to refine these techniques and complete the guidance system.
Item A Computational Study of Visual Attention on Objects and Gestures during Infancy (2017-08)
Mirsharif, Seyyedeh Qazale 1986-; Shah, Shishir Kirit; Subhlok, Jaspal; Gabriel, Edgar; Yoshida, Hanako
Understanding the pathway to the development of visual attention, and the role of vision in object-name learning during infancy, has long been a focus of developmental studies. Head cameras are increasingly used in such studies, as they provide a unique source of information about a child's momentary visual experiences by approximating the child's visual field, and they may yield new insights into what factors generate attention in infants.
However, frame-by-frame analysis of such videos is cumbersome and time-consuming, and several parameters that affect a child's visual attention, such as the constant motion of the camera, cannot be assessed by human analysis. In this thesis, we propose computer vision tools that help developmental scientists perform automated, fast, and accurate analysis of videos collected from child-parent tabletop toy play, furthering our understanding of the development of children's visual attention to objects and gestures. In the first stage of this thesis, we propose a semi-automated method for object segmentation in the child's egocentric videos. The method is applied to a large volume of videos to obtain binary masks of the toy objects used during child-parent toy play; these masks are then used to study how much of the time children visually attend to objects at progressive ages, and where objects fall within their visual field. In the second stage, we propose an automated tool for analyzing motion patterns and parents' gestures in videos of child-parent toy play collected from third-person and bird's-eye views. The proposed method takes an unsupervised approach, extracting dense trajectories from the image sequences and grouping the videos with k-means clustering. Each motion group is further explored to study potential correlations between motion patterns in parents' gestures and object saliency in the child's visual field.
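The trajectory-clustering stage above boils down to running k-means on per-video motion descriptors. A minimal NumPy sketch of that grouping step follows; the 8-D feature vectors are synthetic stand-ins for dense-trajectory descriptors, and the deterministic seeding is a simplification of real k-means initialization:

```python
import numpy as np

def kmeans(features, k, iters=50):
    """Minimal k-means, as used to group videos by their motion descriptors.
    For this toy example the centers are seeded deterministically with the
    first and last feature vectors; real implementations use random or
    k-means++ initialization."""
    centers = features[[0, -1]].astype(float)
    for _ in range(iters):
        # Assign each feature vector to its nearest center.
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its assigned vectors.
        centers = np.array([features[labels == j].mean(axis=0) for j in range(k)])
    return labels

# Two well-separated synthetic 'motion pattern' groups of 20 videos each.
rng = np.random.default_rng(1)
slow = rng.normal(0.0, 0.1, size=(20, 8))
fast = rng.normal(5.0, 0.1, size=(20, 8))
X = np.vstack([slow, fast])
labels = kmeans(X, k=2)
print(labels[:3], labels[-3:])  # first 20 share one label, last 20 the other
```

Once videos are grouped this way, each cluster can be inspected separately, which is what allows motion patterns in gestures to be correlated with object saliency per group.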
The proposed methods enable developmental scientists to explore unknown patterns in the development of children's vision by performing automated, accurate analysis of videos of child-parent toy play obtained from multiple views.
Item A Fast Clustering Algorithm Merging the Expectation Maximization Algorithm and Markov Chain Monte Carlo (2015-05)
Matusevich, David Sergio 1969-; Ordonez, Carlos; Eick, Christoph F.; Azencott, Robert
Clustering is an important problem in statistics and machine learning that is usually solved using likelihood-maximization methods, of which the Expectation-Maximization (EM) algorithm is the most common. In this work we present an algorithm that merges Markov chain Monte Carlo (MCMC) methods with the EM algorithm to find qualitatively better solutions to the clustering problem. We briefly introduce two popular clustering algorithms, K-means and EM, as well as the MCMC algorithm, and show how these algorithms can be combined and incorporated into a database management system (DBMS) using a combination of SQL queries and user-defined functions (UDFs). Even though SQL is not optimized for complex calculations, being constrained to work on tables and columns, it is unparalleled in handling all aspects of storage management, information security, fault management, and so on. Our algorithm exploits these characteristics to produce portable solutions that are comparable to the results obtained by other algorithms and more efficient, since all calculations are performed inside the DBMS. To simplify the calculation we use very simple scalar UDFs of a type available in most DBMSs. The solution has linear time complexity in the size of the dataset and linear speedup with the number of servers in the cluster.
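In-DBMS clustering of this kind typically maintains per-cluster sufficient statistics, a count N, a sum of points L, and a sum of squares Q, so that means and variances come out of a single pass over the data. The sketch below illustrates that general idea in NumPy with invented toy values; it is not the thesis's SQL/UDF implementation:

```python
import numpy as np

def sufficient_stats(X, labels, k):
    """Per-cluster sufficient statistics for one-pass clustering updates:
    N_j (count), L_j (sum of points), Q_j (sum of squares). Means and
    diagonal variances are then derived without revisiting the data."""
    d = X.shape[1]
    N = np.zeros(k)
    L = np.zeros((k, d))
    Q = np.zeros((k, d))
    for x, j in zip(X, labels):      # in a DBMS this loop is an aggregate query
        N[j] += 1
        L[j] += x
        Q[j] += x * x
    means = L / N[:, None]
    variances = Q / N[:, None] - means ** 2
    return N, means, variances

X = np.array([[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]])
labels = [0, 0, 1]
N, means, var = sufficient_stats(X, labels, k=2)
print(N)      # [2. 1.]
print(means)  # [[1. 1.], [10. 10.]]
print(var)    # [[1. 1.], [0. 0.]]
```

Because N, L, and Q are additive, partial statistics computed on different servers can simply be summed, which is what makes the linear speedup across a cluster of servers plausible.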
These results were achieved using sufficient statistics and a simplified model that assigns data points to clusters incrementally during the E-step, together with a sampling step that explores the solution space more efficiently. Preliminary experiments show very good agreement with standard solutions.
Item A Framework Architecture for Shared File Pointer Operations in Open MPI (2013-05)
Vanegas, Carlos R. 1980-; Gabriel, Edgar; Subhlok, Jaspal; Chaarawi, Mohamad
MPI is a message-passing interface that provides a distributed-memory programming model for parallel computation. MPI-I/O is a parallel I/O library that is part of the MPI-2 specification. As an intermediate layer between the application and the file system, MPI-I/O can support features not directly enabled in the underlying file system. One such feature is the shared file pointer: a file pointer to an open file that is shared among the processes that opened the file. The objective of this thesis is to develop a framework for shared file pointer operations in OMPIO for MPI-I/O; to implement and evaluate various existing and new algorithms for shared file pointers; and to develop and evaluate selection logic that decides which algorithm to use. Four algorithms are implemented in this thesis: the locked-file, shared-memory, individual-file, and additional-process algorithms. The results show that the shared-memory algorithm is the fastest; unfortunately, it can be used only when all processes execute on a single node. The individual-file algorithm is a good option when running on multiple nodes, but it supports only write operations.
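The contract every shared-file-pointer algorithm above must enforce is an atomic fetch-and-advance of a single shared offset. This toy in-process sketch uses Python threads and a mutex to illustrate only that contract; the thesis's algorithms coordinate real MPI processes via locked files, shared memory, and so on:

```python
import threading

class SharedFilePointer:
    """Toy in-process analogue of a shared file pointer: every writer must
    atomically fetch-and-advance a single shared offset before writing, so
    concurrent writes never overlap."""
    def __init__(self):
        self._offset = 0
        self._lock = threading.Lock()

    def reserve(self, nbytes):
        with self._lock:                 # mutual exclusion (the 'locked file')
            start = self._offset
            self._offset += nbytes       # advance the shared pointer
            return start

buf = bytearray(32)                      # stand-in for the shared file
ptr = SharedFilePointer()

def writer(payload):
    start = ptr.reserve(len(payload))
    buf[start:start + len(payload)] = payload

threads = [threading.Thread(target=writer, args=(f"rec{i};".encode(),))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(bytes(buf[:20]))  # the four records, contiguous, in some arrival order
```

The performance differences the thesis measures come from how cheaply each algorithm implements this reserve step across nodes, from a shared-memory atomic on one node to a lock on a file visible to all nodes.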
The additional-process algorithm can run across multiple nodes and on all file systems, but it may not be supported in all environments because it requires spawning an additional process.
Item A Framework for Interactive Immersion into Imaging Data Using Augmented Reality (2022-04-25)
Velazco, Jose D.; Tsekos, Nikolaos V.; Leiss, Ernst L.; Eick, Christoph F.; Navkar, Nikhil V.; Webb, Andrew G.
Image-acquisition scanners produce an ever-growing amount of 3D/4D multimodal data that requires extensive image analytics and visualization of collected and generated information. For the latter, augmented reality (AR) with head-mounted displays (HMDs) has been commended as a potential enhancement. This PhD dissertation describes a framework (FI3D) for interactive and immersive experiences using an AR interface powered by image processing and analytics. FI3D was designed to communicate with peripherals, including imaging scanners and HMDs, and to provide computational power for data acquisition and processing. Its core is deployed in a dedicated unit that executes the computationally demanding processes in real time; the HMD serves as the I/O interface of the system. FI3D is customizable, allowing users to integrate different workflows while incorporating third-party libraries. Using FI3D as a foundation, two applications were developed, in the cardiac and urology domains, to experiment with, test, and validate the system. First, cine MRI images were segmented with a machine learning model while an HMD simultaneously rendered the reconstructed surfaces. Second, a simulated environment for robot-assisted, MRI-guided transrectal prostate biopsies was developed, and user studies were conducted to evaluate the feasibility of AR visualization and interaction using the HoloLens HMD.
Performance results showed that the system can maintain a stream of five 512 x 512 images per second and update the visual properties of the holograms once every 16 milliseconds. Interaction studies showed that a gaming joystick allowed more effective manipulation of a robotic structure than holographic menus or a mouse and keyboard. FI3D can serve as the foundation for medical applications that benefit from AR visualization, removing various technical challenges from the development pipeline. The versatile, immersive, and interactive experience offered by the AR interface may assist physicians with diagnosis and image-guided interventions, resulting in safer and faster procedures; this can further increase the accessibility of healthcare to the public, yielding an increase in patient throughput.
Item A Framework for Measuring and Improving VR Competency (2023-08)
Holtkamp, Brian M.; Subhlok, Jaspal; Yun, Chang; Loveland, Katherine A.; Huang, Stephen
As virtual reality (VR) becomes a more accessible and widely used medium for research, training, and education, subjects unfamiliar with VR must learn how to operate within it in order to engage with educational material, complete training, and provide usable data for researchers. A subject's performance can be affected by their level of familiarity with VR, and that impact can appear in analyses based on those performances. This work proposes a framework to identify, measure, and attempt to improve "VR competency": a person's ability to understand and successfully use a VR experience. The framework focuses on three aspects to master: using VR hardware, performing VR interactions, and understanding instructions to achieve the experience's goal. This dissertation covers three studies, exploring each for its successes and failures, analyzing the interactions used, and drawing conclusions within the context of the VR framework.
The first study explored using VR for fire-safety training: subjects learned from the Community Emergency Response Team (CERT) fire-safety module and had to apply the techniques in a simulated house fire. The experience successfully recreated the fire-extinguisher operation section of the module, but subjects reported interaction frustrations caused by an improperly built tutorial, which led some subjects not to complete the trial. The second study used augmented reality (AR) to compare subject-matter-expertise assessment with an AR tool against a traditional pen-and-paper assessment; the AR tool proved comparable as an expertise assessment, but user-interface and interaction problems posed difficulties for subjects. The final study explored the instruction aspect of the framework by providing different instructional techniques to subjects performing a robot-assembly task. The study showed no statistically significant impact on performance, but subjects preferred hologram-based instructional techniques and performed fastest with them on average. As the framework continues to be refined through existing and future work, the results can be used to establish best practices and techniques for future designers and experimenters building VR experiences.
Item A General Summarization Matrix for Scalable Machine Learning Model Computation in the R Language (2019-05)
Chebolu, Siva Uday Sampreeth 1995-; Ordonez, Carlos; Eick, Christoph F.; Kaiser, Klaus
Data analysis is an essential task in research. Modern large datasets contain a high volume of data and may require a parallel DBMS, the Hadoop stack, or parallel clusters to analyze. We propose an alternative to these methods: using a lightweight language/system like R to compute machine learning models on such datasets.
This approach eliminates the need for cluster/parallel systems in most cases, paving the way for an average user to exploit its functionality effectively. Specifically, we aim to eliminate the physical-memory, time, and speed limitations currently present in R packages when working on a single machine. R is a powerful language and very popular for data analysis; however, it is significantly slow, does not allow flexible modifications, and is cumbersome to make faster and more efficient. To address these drawbacks, we implemented our approach in two phases. The first phase constructs a summarization matrix, Γ, in a single scan of the source dataset; it is implemented in C++ using the Rcpp package. There are two forms of the Γ matrix, diagonal and non-diagonal, each efficient for computing specific models. The second phase uses the constructed Γ matrix to compute machine learning models such as PCA, linear regression, Naïve Bayes, K-means, and similar models, and is implemented in R. We bundled the whole approach into an R package titled Gamma.
Item A High-Level Programming Model for Embedded Multicore Processors (2013-08)
Aribuki, Ayodunni 1982-; Leiss, Ernst L.; Stotzer, Eric; Gabriel, Edgar; Johnsson, Lennart; Pâris, Jehan-François
Traditionally, embedded programmers have relied on low-level mechanisms for coordinating parallelism and managing memory. This is typically a herculean task, especially since the approach is processor-specific and must be redone to target a different deployment processor. As multicore technology becomes more prevalent in embedded systems, high-level approaches are being sought to reduce programmers' burden as they write code for more complex multicore systems.
This dissertation explores implementing a high-level shared-memory parallel programming model for embedded multicore processors. The representative processor used for this work is the TMS320C6678 (also referred to as the C6678) digital signal processor (DSP) manufactured by Texas Instruments. The C6678 is a high-performance fixed- and floating-point DSP comprising eight DSP core subsystems. In addition to external memory, it has roughly 8 MB of on-chip memory, most of which may be configured as either cache or scratchpad. When a portion of its local on-chip memory is configured as cache, software-controlled mechanisms must be used to manage the coherence of shared data cached in core-local memories. When the same memory is configured as scratchpad, software-controlled mechanisms are likewise needed to manage data movement between memory segments within the memory hierarchy. This memory organization poses additional challenges when developing applications for the C6678 and for other processors with similar memory setups. In this dissertation, we present a compiler implementation of a high-level programming model for managing parallelism on the C6678. This implementation is leveraged to use scratchpad memory automatically, without additional intervention from the programmer. A high-level construct is also introduced for controlling data placement. An assessment of the performance impact of various memory configurations of the C6678 is also presented.

Item A Methodology for Finding Uniform Regions in Spatial Data and its Application to Analyzing the Composition of Cities (2013-08) Cao, Zechun 1986-; Eick, Christoph F.; Vilalta, Ricardo; Forestier, Germain

Cities all around the world are in constant evolution due to numerous factors, such as rapid urbanization and new modes of communication and transportation. However, the evolution of a city's composition is difficult to follow and analyze.
Since understanding the evolution of cities is key to intelligent urbanization, there is a growing need for urban planning and analysis tools that guide the orderly development of cities and support their smooth, beneficial evolution. Urban patches, which represent uniform areas of a city, play a key role in studying a city's composition, as different types of urban patches are typically associated with different functions, such as recreational or commercial areas. To analyze changes in the composition of cities, this thesis proposes a polygon-based spatial clustering and analysis framework for studying urban evolution. A spatial clustering algorithm named CLEVER is used to identify urban patches, i.e., clusters of polygons representing different elements of the city, based on a domain expert's notion of uniformity, which is captured in a plug-in interestingness function. The analysis methodology uses polygons as models for spatial clusters and histogram-type distribution signatures to describe their characteristics. Finally, popular signatures are introduced that describe distribution characteristics occurring frequently in contiguous sub-regions of a spatial dataset, and an approach is presented that identifies and annotates urban patches with popular signatures. Experiments on datasets of the city of Strasbourg, France serve as an example to highlight the usefulness of the methodology.

Item A Multi-Pronged Approach to Phishing Email Detection (2015-12) Rai, Nirmala 1988-; Verma, Rakesh M.; Mukherjee, Arjun; Bronk, Chris

Phishing emails are a nuisance and a growing worldwide threat, causing losses of time, effort, and money. In this era of online communication and electronic data exchange, every individual connected to the Internet faces the danger of phishing attacks.
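As a minimal illustration of one common building block in email classification (a hedged sketch; the header string, n-gram size, and function names here are illustrative and not taken from the thesis's actual system), character n-grams can be counted over header text and fed to a classifier as features:

```python
def char_ngrams(text, n=3):
    """Frequency counts of character n-grams in a string."""
    text = text.lower()
    counts = {}
    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        counts[gram] = counts.get(gram, 0) + 1
    return counts

# Illustrative header line; a real system would parse many fields
# (Received, From, Message-ID, ...) and feed the counts, alongside
# other features, to a machine learning classifier.
header = "Received: from mail.example.com by mx.example.org"
features = char_ngrams(header, n=3)
```

The appeal of such features is that they require no hand-crafted parsing rules: the classifier learns which character patterns separate phishing headers from legitimate ones.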
Typically, benign-looking emails are used as the attack vectors, tricking users into revealing sensitive information such as login credentials and credit-card details. Since every email carries important information in its header, this thesis describes ways of capturing this information for successful classification of phishing emails. Moreover, the phisher has total control over the email body and subject but little control over the header after the email leaves the sender's domain, unless the phisher is sophisticated and spends considerable time crafting the attack, which reduces the payoff and may even backfire or yield mixed results. This thesis is a consolidated account of various systems designed to combat phishing emails from different dimensions, with the email header as the main area of focus. Techniques such as n-gram analysis, machine learning, and network port scanning are used to extract useful features from the emails. The thesis shows that the classes of features used in these systems are very effective at distinguishing phishing emails from legitimate ones, and it uses several real datasets from varied domains to highlight the robustness of the methods presented. Some methods, such as the header-domain analysis, achieve detection rates as high as 99.9% with false positive rates as low as 0.1%. These approaches have the advantage and flexibility that they can easily be combined with other existing methods, in addition to being used in standalone mode.

Item A Multiscale Model for Breast-Conservative Therapy: Computational Framework and Clinical Validation (2015-08) Simonetti, Valentina 1991-; Garbey, Marc; Bass, Barbara L.; Tsekos, Nikolaos V.

Breast cancer is the most common cancer among women worldwide and affects 12% of women in the USA.
Different surgical approaches exist for treating this cancer: traditional mastectomy (breast removal surgery) and the more recent Breast-Conservative Therapy (BCT), whose goal is to preserve the breast contour and lessen the psychological impact of surgery on the patient. This work contributes to the BCT field by developing a 3D patient-specific multiscale model that predicts the breast shape after lumpectomy, from surgery to complete healing. The model consists of two parts: a hyperelastic Neo-Hookean finite-element model of the breast tissues and skin, and a cellular automata model that mimics the biology of healing after surgery. The resulting multiscale model agrees with our theoretical assumptions and outputs the breast contour after surgery as a function of the patient's anatomy and input from the surgeon. This work is, in fact, the result of an interdisciplinary collaboration between surgeons, mathematicians, and computer scientists. A clinical protocol involving patients eligible for BCT was developed to validate the multiscale model with clinical data. The results show the performance of the model and our findings based on the data of the first patient who took part in the study. The model validation gave a maximum error of 2.5 cm in the surface comparison, which indicates the need for further improvement. The cellular automata model showed fairly accurate results on the preliminary data, but more patients are needed to reach statistically sound conclusions.

Item A New Approach To Domain Adaptation Applied To Supernova Photometric Classification (2016-05) Pampana, Renuka 1990-; Vilalta, Ricardo; Shah, Shishir Kirit; Ishida, Emile

Type Ia supernovae play a vital role in the measurement of cosmological parameters: they are used as ‘standard candles’ for measuring extragalactic distances.
Other supernova types, such as Type Ib and Ic, closely resemble Type Ia but are not as useful as standard candles. Large telescopic surveys capture light curves of these supernova events, referred to as photometric observations, which include all three types. Accurate classification of supernovae from these photometric observations is therefore desirable for proper calculation of cosmological parameters. The existing method for classifying supernovae relies on spectroscopy, which is cumbersome and expensive. In the future, as photometric surveys grow, a vast number of photometric supernova observations is expected, so an efficient classification method is required to replace the existing one. We also want to take advantage of the existing spectroscopically classified dataset when classifying upcoming photometric datasets. Since these two datasets belong to different domains, an adaptive mechanism across the domains is required. We therefore propose a method that builds a predictive model using domain adaptation with active learning to classify supernovae (Ia, Ib, Ic), with spectroscopic data (the source data) as the training set and photometric data (the target data) as the test set. Our method combines two machine learning concepts: (1) domain adaptation, used to transfer source-domain information to the target domain, and (2) active learning, used to rely on only a few target-domain labels, drawn non-uniformly, to build an effective model. The experiments and results show that our method outperforms various domain adaptation techniques, with a significant increase in classification accuracy.
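The combination of domain adaptation and active learning described above can be sketched at toy scale (a simplified illustration with a nearest-centroid learner and synthetic 1-D data; the thesis's actual techniques are more sophisticated): train on labeled source data, query the label of the most uncertain target point, and retrain with that point included so the model adapts toward the target domain.

```python
def nearest_centroid_fit(points, labels):
    """Class means for a 1-D nearest-centroid classifier."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}

def predict(centroids, x):
    return min(centroids, key=lambda c: abs(x - centroids[c]))

# Labeled source domain (think: spectroscopically classified events).
src_x = [0.0, 1.0, 4.0, 5.0]
src_y = ["Ia", "Ia", "other", "other"]

# Target domain is shifted (think: photometric observations).
tgt_x = [1.5, 2.0, 6.0, 7.0]
tgt_true = ["Ia", "Ia", "other", "other"]   # oracle, used only when queried

centroids = nearest_centroid_fit(src_x, src_y)

def uncertainty(x):
    """Small margin between the two nearest centroids = high uncertainty."""
    d = sorted(abs(x - m) for m in centroids.values())
    return d[1] - d[0]

# Active learning: query the single most uncertain target point...
q = min(range(len(tgt_x)), key=lambda i: uncertainty(tgt_x[i]))
# ...then adapt by retraining with the queried target label added.
centroids = nearest_centroid_fit(src_x + [tgt_x[q]], src_y + [tgt_true[q]])

preds = [predict(centroids, x) for x in tgt_x]
```

Only one target label is consulted, yet the retrained centroids shift toward the target distribution, which is the payoff active learning provides when target labels are scarce.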