Browsing by Author "Lindner, Peggy"
Now showing 1 - 20 of 20
Item A Computational Mapping of Online News Deserts on African News Websites(2023-09-28) Madrid-Morales, Dani; Rodríguez-Amat, Joan Ramon; Lindner, PeggyTo date, the study of news deserts, geographic spaces lacking local news and information, has largely focused on countries in the Global North, particularly the United States, and has predominantly been interested in the causes and consequences of the disappearance of local media outlets (e.g., newspapers and TV stations) to the social fabric of a community. In this article, we extend the concept of “news deserts” by drawing on literature on the geography of news in Africa, where information voids have long been documented but have not been studied within the conceptual framework of news deserts. Using computational tools, we analyse a sample of 519,004 news articles published in English or French by news websites in 39 African countries. We offer evidence of the existence of online news deserts at two levels: at a continental level (i.e., some countries/regions are hardly ever covered by online media of other African countries) and at a domestic level (i.e., online news media of a given country seldom cover large areas of the said country). This article contributes to the study of news deserts by (a) examining a continent that has not been featured in previous research, (b) testing a methodological approach that employs computational tools to study news geographies online, and (c) exploring the flexibility of the term and its applicability to different media ecosystems.Item A Parallel Implementation of the Pandas Framework(2020-05) Khan, Saba Hafeez; Gabriel, Edgar; Solorio, Thamar; Lindner, PeggyHigh-performance is a highly desirable trait for applications today. Companies large and small are migrating their serial applications to parallel versions to reduce execution time and increase efficiency. However, preparing serial applications for parallel processing is not a simple process. 
Pandas, which is a Python library containing rich data structures and tools, is used abundantly in Data Science applications. However, Pandas framework is built for single-core processing and is unable to fully utilize multi-core processors or cluster technology. Because of this limitation, Pandas users are forced to look for other frameworks when working with large quantities of data. This thesis introduces a Parallel-Pandas library which makes the process of parallelizing serial Pandas applications easy and transparent. The Parallel-Pandas library provides Pandas users the ability to upgrade existing applications transparently, by using only a library import. This thesis contains details about the design decisions and implementation of the Parallel-Pandas library. The Parallel-Pandas library is evaluated with unit testing, microbenchmarks, and a real-world application with different datasets. Parallel-Pandas has also been compared with PySpark, a framework that provides parallelism by following the MapReduce structure. The results presented in this paper show that the Parallel-Pandas library has promising potential and delivers performance close to manually parallelized and tuned applications.Item Analyzing Counselor Usage in Texas Elementary Schools(2020-09-29) Abudu, ArielResearch shows establishing communities in schools and providing early intervention clientele, like a counselor, may lead to higher social and emotional learning, and higher collaborative engagement especially at the elementary school level. This study aims to delve more into Texas school elementary communities to create a clearer picture of districts that are supported with counseling staff. Data used was sourced from Texas Education Agency’s - Texas Academic Performance Reports (TAPR) and University of Houston’s College of Education’s - Center for Research, Evaluation and Advancement of Teacher Education (CREATE). 
Eight districts in Southeast Texas were chosen for examination, with one district appearing to be at a 50% deficit in total number of counselors compared to the other seven districts. This district also appeared to have the highest number of elementary students considered ‘At-Risk’. At-Risk is defined as students who do not meet state assessment standards, are in a residential placement facility, are homeless, have constant absenteeism, or are not English proficient. There was no apparent correlation between a school having a higher at-risk student percentage and a school receiving additional counselor support. In addition, there did appear to be a correlation between increased student at-risk count at a school and decreased time with staffed counselors. More data with a larger sample on a campus and district level would be recommended for further iterations of this research to provide more details on other in-school clientele available or needed.
Item Association Rule Mining for Risk Assessment in Epidemiology (2016-08) Toti, Giulia 1986-; Vilalta, Ricardo; Lindner, Peggy; Price, Daniel M.; Tsekos, Nikolaos V.
In epidemiology, a risk assessment measures the association between exposures and a health outcome. Risk characterization has traditionally been performed using statistical methods such as logistic regression, but such methods are not effective when working with highly correlated variables and when trying to assess synergic actions between exposures. These limitations become evident in studies related to asthma, a common chronic disease that affects 25 million people in the US. The prevalence of asthma is growing and research is struggling to find the reason. Many factors have been associated with causing and triggering asthma, but their interactions, as well as which one is the most responsible for the spreading of asthma, are still unclear. Outdoor air pollution is on the list of possible causes and triggers.
Characterizing the connection between asthma and air pollution is not an easy task, because of high collinearity between pollutant agents, possible synergic actions, and difficulty in controlling the exposure. The research community is currently encouraging the use of multi-pollutant models to yield better results. In this dissertation we propose: (i) a modified Apriori association rule mining method for identification of connections between exposures and risk variations, and (ii) a novel genetic algorithm (GA) designed to mine risk-based quantitative association rules. Both methods were tested on a group of synthetic datasets, and on a real data collection about pediatric asthma cases and pollution levels in Houston. The results on the synthetic datasets show the advantages of applying our methods to augment traditional logistic regression, and help determine the best metrics to include in the GA fitness function (odds ratio, length, repetition and redundancy). Tests on clinical data suggest the existence of a correlation between asthma and outdoor air pollutants, both alone and as a mixture. The genetic algorithm improves the results of the Apriori-based method by recognizing what appear to be the most dangerous levels of exposure. Future work will help to improve aspects of the GA such as population initialization or rule selection. To date, the proposed methods represent a significant step in the direction of risk assessment based on association rule mining in epidemiological studies.
Item Characterizing Agent Behavior Under Meta Reinforcement Learning With Gridworld (2018-12) Shah, Nolan
The capabilities of meta reinforcement learning agents tend to depend heavily on the complexity and scope of the meta task over which they perform, requiring different models, learning algorithms, and strategies to perform well. In this thesis, we show the fragility of agent design and limitations of agents across Gridworld-based meta tasks of increasing complexity.
We begin by building a characterization of the complexity of meta tasks within a domain generalization context. We run experiments that demonstrate the ability of agents to perform effectively on meta tasks parameterized with different environmental states, but similar underlying rules. Next, we perform experiments that expose the limitations of those same agents over tasks with different underlying rules, but similar observational spaces. These experiments show that generalization-based strategies succeed with meta tasks that sample from a small scope of base tasks with similar underlying rules, but break beyond that complexity. We also infer from observed agent behaviors that the limitations of agents are attributable to the nature of the model architecture and the meta task design. Furthermore, we run experiments that identify the sensitivity of agent behavior to physical features by augmenting the agent observation size. These experiments show a resilience to limited environmental information, but a lack of spatial awareness to abundant environmental information. Overall, this work provides a baseline for meta reinforcement learning with the Gridworld task and exposes the necessary considerations of agent and environmental design.Item Clustering Individual Entities Based on Common Features(2021-12) Bagheri, Maryam; Prosperetti, Andrea; Lindner, Peggy; Yang, Di; Metcalfe, Ralph W.; Ostilla-Mónico, RodolfoIdentification of clusters in spatial or other datasets is of interest in many applications including epidemiology, medical image processing, landscape ecology, criminology, archeology, astronomy, and many other fields. In the current work, we propose a general method for clustering individual entities on the basis of a common feature for both a two- and three-dimensional spatial region. Specifically, the method is demonstrated on a dataset obtained from the resolved simulation of falling particles in upward-directed fluid flow. 
These simulations were conducted in a computational domain in the form of a parallelepiped with a square cross-section and aspect ratio of 3. The boundary conditions on all six boundaries enforced periodicity. The particle feature on which clustering is based is the vertical velocity. The clusters identified group particles that have a velocity larger (in modulus) than a specified multiple of the standard deviation of the vertical velocity of all the particles in the domain. The method starts by dividing the region of interest into cells. To capture clusters that extend over several cells use is made of “masks” including many cells. The location and size of the masks are randomly generated and their number is such that each cell of the domain has an approximately equal probability to be covered by a mask. Masks are labeled as interesting if they contain a sufficient number of particles with large velocities. Counting the number of times that each cell has been covered by an interesting mask, each cell is assigned a value that is analogous to the intensity value of an image pixel. By using a global threshold, the region is binarized into high-intensity and low-intensity cells. The high-intensity cells are grouped into clusters by a method that integrates the region growing and region merging methods of digital image processing. The method is shown to work well properly accounting for the spatial periodicity of the data and to be able to track the clusters in time.Item Consequences of Tenure-Clock Extensions for Parents in Academia(2018-12) Alanis, JoseeThis thesis seeks to examine the potential consequences of tenure-clock extensions for parents in academia. I first propose that tenure-clock extensions will be associated with a longer period of time taken for faculty members to be promoted to the rank of full professor. In addition, I also propose that tenure-clock extensions are associated with a decrease in scholarly productivity. 
With the birth of a child, faculty may take on additional family responsibilities that decrease their availability for scholarly activities, however women are likely to be affected to a greater extent due to their primary role as a caregiver. My study indicated that tenure-clock extensions are not significantly related to time to promotion to full professor, but they are associated with a significant decrease in productivity.Item Digital Project Management Workshop: How to Keep Your Head above Water during a Digital Project(2020-05) Neumann, Kristina; Lindner, PeggyIn this workshop, Dr Kristina Neumann (Department of History, UH College of Liberal Arts and Social Sciences) and Dr Peggy Lindner (Computer Information Systems, UH College of Technology) discuss how to build and manage an evolving digital project and research team, as well as strategies for keeping the lines of communication open between the humanities and STEM.Item Evaluating the Impact of Selective Seam Weld Corrosion on Pipeline Integrity(2023-12-15) Goldberg, Harry L.This thesis examines the impact of Selective Seam Weld Corrosion (SSWC) on the burst pressure of pipelines. This process is caused by factors like galvanic reactions, weld defects, and chemicals like sulfur becoming trapped in the weld material, and results in the longitudinal weld seams being corroded faster than the surrounding pipe material, creating a sharp v-shaped groove that creates a higher concentration of stress. This impact is worsened by the fact that this higher stress is experienced in the Heat Affected Zone, a relatively weak section of material compared to the base metal of the pipe. SSWC is relatively uncommon, seen primarily in ERW and EFW pipes manufactured before 1985, and thus warrants further study this thesis aims to contribute to. 
This thesis investigates the change in burst pressure when the same pipe sees a varying series of defects, investigating the relationship between this burst pressure and defect dimensions. These results were then compared to values from the Recalibrated PCORRC model, a mathematical model created to predict burst pressure with crack defects. Through this study, it was determined that defect has the largest impact on weakening a pipe's pressure-containing ability, with radius having a smaller but still significant impact. It was also determined that the Recalibrated PCORRC model was most accurate for cracks with a tip radius between approximately 0.7 and 1 mm. For larger radii, these results were further from the FEA results; however, this difference was conservative and predictable. Meanwhile, for radii smaller than 0.7 mm, burst pressures were overstated and inconsistent.
Item Evaluation of mpi4py for Natural Language Processing Scenarios (2018-05) Saxena, Manvi 1985-; Gabriel, Edgar; Solorio, Thamar; Lindner, Peggy
Many Natural Language Processing (NLP) applications operating on large data sets are written in programming languages that do not have bindings in the Message Passing Interface (MPI) specification. Yet, with increasing problem sizes, these applications also necessitate some form of parallel and distributed processing. The goal of this thesis is to evaluate the utilization of MPI with a non-traditional HPC programming language, Python, for NLP application scenarios. The thesis is divided into two parts. The first part evaluates the performance and functionality of mpi4py, a Python module for MPI bindings, using multiple point-to-point benchmarks compared against native C-based MPI benchmarks over InfiniBand and Gigabit Ethernet network interconnects. The results show that in many instances communication performance of the Python benchmarks was on par with their C-based counterparts.
In the second part of the thesis, a few application scenarios used in Natural Language Processing (NLP), such as word count, n-gram count, and TF-IDF, were developed, and the mpi4py module was used to distribute data across different nodes for these scenarios and to evaluate performance. The results demonstrate that applying the mpi4py module in NLP scenarios can greatly improve execution time.
Item Gender Differences in Job Search and Networking Behaviors Among Scholars (2022-04-29) Barnes, DuBois H.
For this study, we sought to determine if gender plays a role in various work-related behaviors. Gender was used as a moderator for all four hypotheses. Analyses were run in SPSS. A binary logistic regression was used to analyze three of the hypotheses. Hypothesis 3 had a continuous outcome variable of h-index, so a linear regression analysis was used. For hypotheses 1 and 2, we analyzed the relationship between networking behaviors and job search/job offers. For hypotheses 3 and 4, we looked at whether increasing networking behaviors increased h-index, but more so for men, and whether a high h-index resulted in a higher chance of a job offer for men than for women. No significant results were found for any analyzed interactions. Implications for this study include how length of career and h-index score can increase the likelihood of networking and decrease job seeking. This can be useful for identifying populations interested in changing employment, and identifying populations that may benefit from networking interventions. Additionally, it suggests future research is needed regarding stage in career and employment behaviors.
Further research on gender differences in networking and job search is needed.
Item Methodological Contributions in Brain-Imaging Genetics: A Review, Simulation Study, and Experimental Analysis (2023-08) Cheek, Connor; Gunaratne, Gemunu H.; Grigorenko, Elena L.; Lindner, Peggy; Morrison, Greg; Ratti, Claudia
Brain-imaging genetics is focused on understanding the genetic underpinnings of complex cognitive traits using imaging endophenotypes that reflect brain structure or function. This field has seen much methodological development over the past decade, but has not reached a consensus on “best” techniques for analyzing combined genetic, brain-imaging, and behavioral datasets. Historically, a lack of large and diverse datasets has limited method exploration. Recently, after years of collection, new biobanks are releasing large-scale datasets aimed at understanding the biology of complex cognitive traits. Namely, the Child Mind Institute’s Healthy Brain Network (CMI HBN) has recently released genotypic, structural brain-imaging, and academic cognitive assessments for up to 4,868 children. We performed a three-part study to analyze this novel dataset. We first present a narrative review that captures the current state of brain-imaging genetic methodology. Next, we select leading methods from the field and compare their performance in a simulated setting. We find that a regularized multiple multivariate regression strategy, the elastic net, is most suited to analyzing simulated brain-imaging genetic data under a range of assumptions. We then apply the elastic net to the CMI HBN dataset exploring complex academic skills. Our analysis identifies several genes and imaging indicators associated with reading ability. Many of these findings align with previously established associations between the identified genes and reading or similar academic achievement measures.
Besides our genetic findings, a key contribution of this work is a flexible and data-driven pipeline for analyzing any brain-imaging genetic datasets like the CMI HBN. Our analytical strategy can serve as a valuable resource for future brain-imaging genetic studies and facilitate the identification of key factors involved in academic skills.Item Modeling Community Health in a Simulated City(2018-10-18) Henini, Manale; Price, Daniel M.; Lindner, PeggyCreating a model of Houston will allow others to further their research on public health questions and policies without running into any legal or privacy violations due to laws like HIPAA. The purpose of SAM City is to use data from the U.S. Census and from the Harris County Appraisal District (HCAD) to create that model of Houston. The model will then be used by other research students to help answer various public health questions. These questions can range from determining what areas of Houston would be affected by a water treatment to researching what areas would benefit most from a diabetes intervention program. The model will be simulated by using a program that manages and analyzes large sets of data (obtained from the U.S. Census and HCAD) and uses data visualization with the programming language R. The program needed to be improved because the runtime was excessively slow, and the model was incomplete. After rewriting the program to use more efficient methods with less repetition, the program's runtime was almost halved and the resulting model was more accurate due to less human error, such as typos. 
Having an updated and accurate model will enable those who use the model to reach better conclusions and have more confidence in their results, which could then help improve public health policies and regulations in Houston.Item Parallel I/O on Compressed Data Files(2022-04-28) Singh, Siddhesh P.; Subhlok, Jaspal; Gabriel, Edgar; Wu, Panruo; Shah, Shishir Kirit; Lindner, PeggyThe increase in processing power of modern computing hardware has not been accompanied by a proportional increase in the performance of storage technology leading to an imbalance in cluster and parallel computing architectures where input-output (I/O) operations may bottleneck the overall performance of the system. This makes necessary the use of sophisticated software solutions to overcome limitations on I/O performance. One method is to apply specialized algorithms in parallel I/O to optimize data transfer. Another solution to this problem is to use data compression to effectively reduce the amount of data which is transferred between processing and storage units. An under examined area of research is the intersection of parallel I/O and data compression and how these two techniques can be combined in High Performance Computing (HPC) environments. This dissertation presents a general model for incorporating data compression within existing parallel I/O algorithms and evaluates the performance benefits obtained through performing parallel I/O on compressed data files. In particular, the dissertation presents an Open MPI-I/O (OMPIO) implementation which incorporates arbitrary compression libraries within the two phase I/O algorithm through a new file format. 
The results indicate significant performance and space saving benefits through this approach and the parallel compression semantics presented in this dissertation provide a theoretical basis for future research in parallel I/O and data compression.Item PLAYING WITH THE PAST: THE IMPORTANCE OF HISTORICAL VIDEO GAMES FOR THE FIELD OF HISTORY(2023-06-14) Erickson, Brian T.; Neumann, Kristina M.; Perales, Monica; Lindner, PeggyVideo games have existed for decades, allowing billions of people around the world to explore the digital worlds created by passionate developers. Historical games have always been a part of the medium, though their prominence began to rise with the turn of the century. Historical games like the Civilization franchise, Assassin’s Creed, and many others have brought ancient history into the modern digital age. For many game developers, history became their digital playground. However, even with their importance in the modern cultural zeitgeist, historical video games seemingly do not have a large influence within the scholarly field. Regardless of whether scholars care about video games and historical video games, video games play an enormous role in shaping a wide public’s understanding of the past. Therefore, this thesis explores the importance of historical games within a scholarly context. It examines the role that historians have traditionally played in video game development. Through a series of in-depth interviews, it assesses the reception of these historical games by scholarly and public audiences. Finally, this thesis argues that video games, historical and non-historical, create a new avenue for scholars to engage. 
They can become an anchor for scholars in the digital era by creating ways to study, educate, preserve, and play with the past.Item Spectral Power Density Analysis of Patients with Primary Progressive Aphasia Using Resting-State Electroencephalography(2023-05-09) Quinn, Christina NikolePrimary Progressive Aphasia (PPA) is a neurodegenerative syndrome with insidious speech and language deficits that gradually worsen as the disorder progresses. Once a general diagnosis of PPA is confirmed, it is further broken down into three variants indicated by the presence or absence of specific speech and language characteristics: nonfluent/agrammatic, semantic, and logopenic. Application of non-invasive neuroimaging techniques can help to confirm diagnosis of PPA and its three variants. The application of resting-state electroencephalography (EEG) to analyze neural oscillations via spectral power density may be more accessible to patients who are otherwise unable to use traditional imaging techniques or who struggle with task-based neuroimaging. Oscillatory slowing, characterized by an increase of relative spectral power density in the low frequency delta and theta bands, along with a reduction in spectral power in the high frequency alpha and beta bands, has been observed in persons with PPA. This study examines relative power spectral density across all three variants of PPA in the delta, theta, delta-theta, alpha, beta, and low gamma frequency bands in eyes open and eyes closed resting-state conditions to see if discernible differences were observed in each variant. The results of this study were similar to findings in previous studies for the logopenic variant, with a significant increase in relative spectral power in the low frequency delta and theta bands and a significant reduction in the high frequency beta band. In contrast to other studies, we did not observe the same decrease in spectral power for the logopenic variant in the alpha band. 
We did not observe the same increase in spectral power for the low frequency delta and theta bands for the nonfluent or semantic variants, nor did we observe a reduction of power in the high frequency bands for these two variants, as has been observed in other studies. The high frequency low gamma band, which previous studies have not studied across the three PPA variants, showed a significant increase in spectral power in the semantic variant. Our results suggest that resting-state EEG may prove useful as a biomarker for early and more accurate diagnosis of PPA.
Item SYRIOS (2023-04-13) Dempsey, Rowan; Hasan, Haadi
SYRIOS is a research organization dedicated to testing the User Experience for online exhibitions. For the past year, we have been learning of Syrian history through coins and sharing any important information to the website's log, where we then meet with "testers" to make sure the site runs smoothly. Testers are individuals from any background - nationality, profession, gender, or age - who have agreed to let us assign them tasks to complete and study how the exhibition performs. SYRIOS is happy to bring Syrian culture to life through this research and allow our testers to help us grow in the humanities field.
Item The Trickle-Down Effect of Academic Mentoring (2020-05) Lezcano, Alyssa
The mentoring literature has not sufficiently explored the potential trickle-down effects of mentoring, and there has yet to be an examination of how and why the amount of mentoring received might lead a person to mentor a greater number of protégés. This thesis seeks to address these gaps in the literature by examining the role of faculty support systems in promoting greater numbers of mentored students. To accomplish this, I examine career sponsorship as a means to increase the number of student protégés through heightened faculty commitment to the mentoring process using a sample of 255 tenured and tenure-track faculty members across 25 public universities in the United States.
The results support the proposed hypotheses and indicate that career sponsorship of faculty has a positive indirect effect on number of undergraduate and graduate protégés via increased faculty mentoring commitment.Item The Work-Life Interface and Job Performance(2020-12) Salazar, CheyenneThe work-life interface literature has not adequately examined the objective effects the work-life interface can have for individuals. There has also yet to be an investigation on how an individual’s imbalance or balance between their work-life interface might affect their job performance. This thesis seeks to address these gaps in the literature by exploring the roles of work-family conflict, family-work conflict, work-family balance satisfaction, work-family balance effectiveness, and gender for one’s job performance. In order to accomplish this, I examine these specific facets of the work-life interface and how they affect job performance, in terms of h-indexes, using a sample of 266 tenured and tenure-track faculty members across 25 public universities in the United States. The results support a couple of the proposed hypotheses, indicating that work-family balance satisfaction enhances job performance and that men have better job performance compared to women.Item Towards Microrobot Swarm Path Planning(2019-08) Huang, Li; Becker, Aaron T.; Claydon, Frank J.; Mayerich, David; Nguyen, Hien Van; Lindner, PeggyTiny robots have many promising applications in medical treatment including targeted drug delivery, non-invasive diagnosis, and minimally invasive surgery; and in micro-assembly/ -fabrication for Micro-Electro-Mechanical-Systems (MEMS). Microrobots are often deployed in large populations, and typically steered by uniform driving signals, including magnetic, electromagnetic, electrostatic, optical, gravitational, thermal, and chemical. 
The homogeneity of the microrobots and the uniformity of the control input make microrobot swarm manipulation difficult in constrained workspaces such as human vascular networks. The control laws and path-planning algorithms designed for macro-size robotics do not scale well to a microrobot swarm, so new methodology must be developed to address more efficient planning with constraints for a multi-agent problem in microscale. This thesis addresses the path-planning problem of a swarm of microrobots using a global control input. It begins with an introduction to state-of-the-art research and applications in microrobots. Chapter 2 gives an analysis of 2D and 3D position control of heterogeneous microrobots in the free space, together with demonstrations in simulations and hardware experiments. Motivated by the need for higher computational efficiency and capability of swarm manipulation with spatial constraints, chapter 3 discusses strategies of planning in 2D vascular networks for a swarm of homogeneous microrobots given a shared, global control input. Multiple path-planning methods and control algorithms are proposed, and their performance is compared in multiple vascular networks with different scale and complexity. The algorithms are validated with simulations and hardware experiments. Chapter 4 investigates reinforcement learning strategies to further improve path-planning efficiency, and to overcome local minima dilemmas in online algorithms. Chapter 5 reports automatic steering methods in multi-bifurcation vessels with flow, and reinforcement learning algorithms are implemented for improvement in microrobot delivery rate.