Undergraduate Research Day Projects

Permanent URI for this community: https://hdl.handle.net/10657/2212

Organized by the University of Houston Office of Undergraduate Research and Major Awards, Undergraduate Research Day is an annual event showcasing exceptional scholarship undertaken by the UH undergraduate community.

A Question Selection Strategy for Early Warning Systems
(2018-10-18) Zeng, Victor
Early warning systems, or early alert systems, identify students at risk of failing a course. These systems use two categories of indicators: traditional indicators such as assignment grades and class attendance, and “soft” factors such as the student’s behavior and learning network. Naturally, in the interest of preserving user engagement, an early warning system should ask as few questions as possible. In this research, we seek to determine whether an academic early warning system can obtain from an incomplete data set a level of prediction accuracy comparable to that obtained from a complete data set. A set of questions is developed about the student’s study habits, study attitudes, study anxiety, time management, learning network, and class participation. The answers to these questions are used to identify students’ characteristics. The pilot study is based on previous sessions of the Data Structures course at the University of Houston. First, classifiers are constructed based on two different algorithms: k-nearest neighbors and a feed-forward neural network. Then performance on training datasets of assignment and exam grades is measured using three-fold cross-validation. In the future, we plan to implement the study by asking the students in the upcoming fall session of Data Structures these questions and performing a mutual information analysis of their responses. If there is a high level of mutual information, we will perform offline experiments on the data set to explore a mutual information approach and a PCA-based approach to select optimal subsets of questions to ask individual students.
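As a rough illustration of the mutual information analysis mentioned in this abstract, the sketch below scores each survey question by the mutual information between its answers and an at-risk label, then ranks the questions. The abstract does not say which variables are compared, so pairing each question with an outcome label is an assumption, and the data, question names, and function name are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two equal-length discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        # p(x, y) / (p(x) * p(y)) written with raw counts
        ratio = c * n / (count_x[x] * count_y[y])
        mi += p_joint * math.log2(ratio)
    return mi

# Hypothetical yes/no answers from eight students (assumed data, not from the study).
answers = {
    "q1_studies_daily": [1, 1, 0, 0, 1, 0, 1, 0],
    "q2_attends_office_hours": [1, 0, 1, 0, 1, 0, 1, 0],
}
at_risk = [1, 1, 0, 0, 1, 0, 1, 0]  # hypothetical outcome labels

scores = {q: mutual_information(a, at_risk) for q, a in answers.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Under these assumptions, questions with higher scores carry more information about the outcome and would be asked first; a real question selection strategy would also need to account for redundancy between questions.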
Analyzing Errors of Neural Models in Named Entity Recognition
(2020-09-29) Parikh, Dwija
Despite stellar performance on many NLP tasks, the behavior of neural models like BERT is not properly understood. We attempt to analyze the behavior and recognize patterns in errors for the NER task. We evaluate the predictions and errors generated to gain insight into the model's behavior. Our findings show that there are underlying patterns leading to unintended memorization. Future research is required to address these errors and fine-tune the model.
Can a more comprehensive index be formed to identify areas of low access to food sources in Harris County?
(2023-04-13) Harrison, Noah
The United States Department of Agriculture has four definitions for which census tracts in the United States can be flagged as “food deserts,” which can be found in the 2019 Food Access Research Atlas. Such methodology has several potentially detrimental limitations that can be addressed. By collecting adequate data, another index can be formed to gauge the level of access to healthy foods in each census tract.
Decoding Administrative Data for Process of Care in Cancer Treatment
(2022-04-14) Parikh, Dwija
Past decades in cancer treatment have shown a drastic decrease in mortality rates due to advancements in diagnosis and treatment. However, cancer care remains extremely complicated since multiple diagnoses are frequent, and it takes numerous tests to narrow specific tumors. This study aims to develop evidence-based tools that providers can use to make decisions about patient care.
Development of Integrated Search, VideoPoints
(2021-04-01) Trinh, Viet
During the last few years, an advanced lecture video portal called VideoPoints has been developed at the University of Houston to utilize recorded lecture videos to support classrooms. This web portal focuses on helping students locate their content of interest as quickly and precisely as possible. Some of the important features of VideoPoints are: Content-based Indexing: dividing a lecture video into a sequence of cohesive video segments, each covering a subtopic in the presentation. Video Summarization: generating keywords and images that summarize each video segment. Search: finding all video segments that contain the keywords of interest. The short-term purpose of this project is to add the ability to search in the audio captions (transcripts), considering the importance of verbal explanation. The long-term purpose is to improve the Video Summarization feature with important keywords rated by their number of occurrences in both speech and slides. Currently, the web portal fetches both video contents and audio captions for all keyword occurrences. Hence, we can quickly locate all the segments that contain the keywords of interest. Then, we developed a ranking system to rate the mixed search results from both video segments and audio captions based on the frequency and the importance of keywords. The result is then displayed on top of each video segment thumbnail as a rating bar, which helps students find their desired content more easily. Finally, we also refined the keyword suggestions based on the frequency and the user’s search history.
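The ranking idea described in this abstract can be sketched as a weighted keyword count over each segment's slide text and captions. This is only an illustration: the weights, the field names (`id`, `slides`, `captions`), and the sample segments are assumptions, and the actual VideoPoints scoring formula is not given in the abstract.

```python
def score_segment(slide_text, caption_text, keywords,
                  slide_weight=2.0, caption_weight=1.0):
    """Weighted count of keyword occurrences in slide text and captions.

    The weights are hypothetical; here a hit on a slide counts twice as
    much as a hit in the spoken captions.
    """
    slide_words = slide_text.lower().split()
    caption_words = caption_text.lower().split()
    score = 0.0
    for keyword in keywords:
        keyword = keyword.lower()
        score += slide_weight * slide_words.count(keyword)
        score += caption_weight * caption_words.count(keyword)
    return score

def rank_segments(segments, keywords):
    """Return ids of segments with at least one hit, best score first."""
    scored = [(score_segment(s["slides"], s["captions"], keywords), s["id"])
              for s in segments]
    hits = [(score, seg_id) for score, seg_id in scored if score > 0]
    return [seg_id for score, seg_id in sorted(hits, key=lambda t: (-t[0], t[1]))]

# Hypothetical lecture segments (assumed data).
segments = [
    {"id": "seg1", "slides": "binary search tree insertion",
     "captions": "we insert a node into the tree and the tree stays sorted"},
    {"id": "seg2", "slides": "hash tables",
     "captions": "a hash table is not a tree"},
    {"id": "seg3", "slides": "graph traversal",
     "captions": "breadth first search visits neighbors first"},
]

print(rank_segments(segments, ["tree"]))
```

A score like this could be normalized to a 0 to 1 range to drive a rating bar on each thumbnail, though how the portal actually scales its ratings is not stated.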
Entropy-based scheduling performance in real-time multiprocessor systems
(2021-04-01) Rivas, Daniel E.
In this senior research project, we present the performance analysis of the entropy-based scheduling approach in real-time multiprocessor systems. We analyze the effect of using the entropy-based scheduling layer in deadline-based (global EDF), laxity-based (LLF), and PFair-based (PD2) scheduling algorithms by measuring the number of preemptions, the number of job migrations, and the number of task migrations. The performance comparison between the selected scheduling algorithms and their entropy-based versions showed that the entropy layer reduces the number of task migrations for all studied algorithms and reduces the number of job migrations for LLF and PD2.
Evaluation of Features and Clustering Algorithms for Malware
(2018-10-18) Faridi, Houtan
Malware has undoubtedly become a major threat in modern society, and its numbers are growing daily. The internet today is increasingly used by highly skilled malware developers and has even become home to large black markets for the purchase and spreading of malware. This provides a strong incentive for malware developers to decrease the chances of being detected by anti-virus programs. By using different obfuscation techniques, authors can ensure other versions of their malware continue to function if a signature is developed for another. This leads to multiple new implementations of the same type of malicious software that can propagate out of control. Approximately 400,000 new malware samples are registered every day, which gives rise to the problem of processing the huge amount of unstructured data obtained from malware analysis. This also makes it challenging for anti-virus vendors to detect zero-day attacks and release updates in a reasonable time frame to prevent infection and propagation. Hence, to ensure that a large number of malware samples are analyzed and understood, a possible technique is to cluster them into groups of malware that have similar characteristics. These groupings can help in visualizing relationships between malware and their evolution over time, constructing automatic signatures for entire groupings of malware instead of individually, and even detecting zero-day malware. By extracting data via dynamic analysis, we test several combinations of features to generate clusters using multiple different mechanisms combined with a host of different similarity measures and analyze the results.
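To make the notion of a similarity measure between malware samples concrete, here is a minimal sketch using Jaccard similarity over sets of behaviors observed during dynamic analysis. The abstract does not name the measures or features it tested, so both the choice of Jaccard similarity and the API-call feature sets below are assumptions for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two feature collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical behaviors recorded for three samples (assumed data).
samples = {
    "sample_a": {"CreateFile", "WriteFile", "RegSetValue"},
    "sample_b": {"CreateFile", "WriteFile", "Connect"},
    "sample_c": {"Connect", "Send"},
}

# Pairwise similarity table that a clustering algorithm could consume.
names = sorted(samples)
similarity = {
    (x, y): jaccard(samples[x], samples[y])
    for x in names for y in names if x < y
}
for pair, value in similarity.items():
    print(pair, round(value, 2))
```

A pairwise table like this is the usual input to hierarchical or threshold-based clustering; other measures would plug into the same structure.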
Image Classification of Dewetting Microscopy Using Artificial Neural Networks
(2018-10-18) Sutrisno, Raymond
Contemporary methods to analyze dewetting stages from optical microscopy are limited to manual classification. The project seeks to automate this process by using image processing techniques and machine learning. Magnitude-independent features, such as pixel skew, variance, and entropy, along with their local deviations, were used to train a simple feed-forward neural network. From a dataset of 64 images, tuning was achieved by selecting the neural network hyperparameter configuration with the highest peak cross-validation score. The selected model accurately classified approximately 80% of the testing set.
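The pixel statistics named in this abstract can be sketched as follows for a grayscale image given as a flat list of integer intensities. This is an illustrative computation only: the exact feature definitions, any normalization used to make them magnitude independent, and the local-deviation features are not specified in the abstract, and the tiny sample images are hypothetical.

```python
import math
from collections import Counter

def intensity_features(pixels):
    """Return (variance, skewness, entropy in bits) of integer gray levels."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    if variance == 0:
        skewness = 0.0  # a constant image has no asymmetry
    else:
        third_moment = sum((p - mean) ** 3 for p in pixels) / n
        skewness = third_moment / variance ** 1.5
    # Shannon entropy of the gray-level histogram
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return variance, skewness, entropy

# Hypothetical 2x2 images flattened to lists (assumed data).
two_tone = [0, 0, 255, 255]
flat = [7, 7, 7, 7]

print(intensity_features(two_tone))
print(intensity_features(flat))
```

Feature tuples like these, computed per image or per local window, would form the input vector for a small classifier.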
Increasing Student Retention in the Course of Data Structure Through the Implementation of Data Mining
(2017-10-12) Tian, Yilei
In keeping with the university's strong commitment to supporting students in completing a high-quality education within four years, this research seeks to extend that commitment and shed light on improving students' performance in class. To do so, it uses data collected from the course "Data Structure" to examine students' sense of their own achievement, in order to better understand their learning outcomes. It aims to propose a framework to predict an individual's class performance, so instructors can take early action and provide students with the help needed to succeed. This research shows that, regardless of exam difficulty, students have a general idea of how well they will perform on the exam. This means the likelihood that unprepared students will have good fortune and receive exceptionally good grades is extremely low. Thus, low self-assessment responses give the first signs of dropping or failing, and students should trust their intuitions and hold themselves responsible for seeking help. On the other hand, students with high self-assessment responses and an accurate sense of how well they will perform tend to have stable performance over time.
LGBTQ Overrepresentation, Experience, and Mental Health Outcomes in Federal and State Prisons by State
(2022-04-14) Higdon, Jacob
The LGBTQ community is overrepresented in most, if not all, components of the U.S. criminal justice system, including sentencing, incarceration, and solitary confinement. Using the Bureau of Justice Statistics' Annual Survey of Prison Inmates (hereafter SoPI), this study examined patterns of overrepresentation, potentially different treatment by prison staff, and mental health outcomes for LGBTQ people sampled by the survey across various states. By ranking states on these measures, this study assesses possible relationships to key socioeconomic factors. This strategy identified LGBTQ overrepresentation and poor treatment in U.S. federal and state prisons and highlighted some ways in which the SoPI lacks vital information needed to examine aspects of this issue. It therefore suggests that in the future different sampling methods or larger sample sizes may be needed to adequately assess this subpopulation.
Localizing an RF Transmitter using Received Signal Strength
(2021-04-01) Taylor, Conlan; Yannuzzi, Michael; Garcia, Javier
This poster details work conducted by student researchers in the Electrical and Computer Engineering Robotic Department's Swarm Control Lab. Localization of RF transmitters has been attempted using various means and computational algorithms to achieve the most accurate estimates of a source's location in noisy systems. We examined a method proposed by F. Koo and J. Cha to test these algorithms using the received signal strength at multiple receiver locations. The theory behind Koo and Cha's method was found to be compatible with our desired hardware and application in aquatic sensory drift nodes. Preliminary measurements and simulations yielded positive results, further supporting the adoption of this method in RF localization.
Mapping Advanced Coursework: The Role of School Attendance Boundaries and Income Segregation
(2022-04-14) Wolken, Hannah
School Attendance Boundaries (SABs), the geographical boundaries that determine which school a student attends, can perpetuate income segregation in public school campuses by dividing households into one school or another. The goal of this study is to quantify the impact of school attendance boundaries on access to advanced coursework in the school districts of the Greater Houston area. Data from the TEA and CREATE were used for insight into advanced coursework, the Census for values pertaining to wealth distribution in the Houston area, and the NESC for the geographical data necessary for the School Attendance Boundaries. The results show a significant relationship between high median household income and advanced coursework access among high school campuses in the Greater Houston area, as well as a relationship between higher levels of advanced coursework and income segregation. However, the irregularity of school attendance boundaries showed little correlation with income segregation.
Uncovering Relationships Between Climate Change and Quality of Life
(2022-04-14) Broekhuis, Thomas
This project creates a novel index to measure climate harshness and applies this index to locate correlations between changes in climate and changes in quality of life.
Using Clustering Techniques to Classify Self-Efficacy of Women in Computer Science
(2020-09-29) Tuy, Pichvyda
The percentage of women majoring in computing fields has fallen dramatically in the past decades, from 35%-37% in the 1980s to about 20% in the present. In addition to the low enrollment rate, there is also a large number of female CS students who are dropping out or considering switching their majors. Multiple studies have found self-efficacy to be one of the main barriers causing this retention problem. The purpose of this study is to investigate the effects of academic standing, programming experience, and skill level on the self-efficacy of female students at the University of Houston, and to classify their self-efficacy levels based on their assigned clusters. Clustering models are created using two clustering algorithms: k-means clustering and hierarchical clustering. Finally, we will evaluate the classification accuracy of the model by comparing its outputs with the true outputs obtained from survey results. Results on clustering models show that k-means clustering is superior to hierarchical clustering in terms of cluster analysis and cluster interpretation. Clusters produced by k-means show a linear positive relationship between academic standing, programming experience, and skill level. Evaluation of classification accuracy will be carried out as part of our future work.
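For readers unfamiliar with k-means, the clustering step described in this abstract can be sketched with a small standard-library implementation. The three features follow the abstract (academic standing, programming experience, skill level), but the numeric encoding, the sample responses, and the naive initialization from the first k points are all assumptions for illustration, not the study's setup.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means; returns (labels, centroids).

    Initialization uses the first k points, a simplification chosen
    here for reproducibility rather than quality.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical survey rows: (academic standing 1-4, years programming, skill 1-5).
responses = [(1, 0, 1), (1, 1, 2), (2, 1, 2), (4, 5, 5), (3, 4, 4), (4, 4, 5)]

labels, centroids = kmeans(responses, k=2)
print(labels)
```

With this toy data the two clusters separate early-stage, low-experience respondents from advanced, experienced ones, which mirrors the positive relationship between the three features that the abstract reports.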
Using K-Nearest Neighbors to Classify Undergraduate Female Self-Efficacy in Computer Science
(2021-04-01) Farooque, Aisha
Since the introduction of new curriculum standards in high school, interest in Computer Science has been increasing among incoming first-year undergraduate students. However, retention rates in Computer Science, especially for female undergraduate students, are among the lowest of all STEM majors. Therefore, this research aims to assess the relationship between Computer Science and programming self-efficacy among female STEM major and minor students. We will use the results to help develop supplemental resources for undergraduate female students. Throughout this study, information will be collected to develop a classification-based machine learning algorithm. A focus group will be conducted to gather more input from the students on their Computer Science educational experience. The findings will be used to investigate and improve areas of concern for female undergraduate students. Since self-efficacy is a product of self-belief and engagement, this supplemental support will help stakeholders such as instructors, universities, and companies to generate suitable strategies to address the issue and support female Computer Science undergraduate students in their journey to become computing professionals.
Using K-Nearest Neighbors to Classify Undergraduate Female Self-Efficacy in Computer Science
(2020-09-29) Farooque, Aisha
Since the introduction of new curriculum standards in high school, interest in computer science has been increasing among incoming first-year undergraduate students. However, retention rates in computer science, especially for female undergraduate students, are among the lowest of all STEM majors. Therefore, this research aims to assess the relationship between computer science and programming self-efficacy among female STEM major and minor students. We will use the results to help develop supplemental resources for undergraduate female students. Throughout this study, information will be collected to develop a classification-based machine learning algorithm. A focus group will be conducted to gather more input from the students on their computer science educational experience. The findings will be used to investigate and improve areas of concern for female undergraduate students. Since self-efficacy is a product of self-belief and engagement, this supplemental support will help stakeholders such as instructors, universities, and companies to generate suitable strategies to address the issue and support female computer science undergraduate students in their journey to become computing professionals.

Browsing Undergraduate Research Day Projects by Department "Computer Science, Department of"