Introduction:
My name is Hoan Tran and I am a graduate of the University of California, Irvine, where I majored in Computer Science. My interests within the field
include AI/Machine Learning as well as Computer Vision.
These are some of the projects I have done so far that I felt were worth sharing. Many revolve around AI/Machine Learning concepts, but I also have experience with
other types of projects, such as full stack development and algorithm testing.
Summary:
While trying to use Unreal Engine as a tool to learn more about games and graphics, I found that learning shader programming natively within Unreal was not very accessible.
There is a node called "Custom" inside the UE5 shader graph that allows the user to add HLSL code to the shader graph, but its usability is cumbersome. The node accepts
inputs and outputs that can then be wired into the rest of the shader graph, so the user can mix regular shader graph nodes with HLSL code. For the most part, it is
perfectly functional at allowing HLSL code to be added to the shader graph; the main issues stem from its usability and UI implementation.
Currently, the code has to be typed into a small text field off to the side of the screen, and the inputs and outputs need to be added manually, one by one, with names that
match the HLSL variables they feed into within the code. Most workflows and tutorials involving the Custom node therefore use a third-party text editor or IDE to author the
HLSL code and then copy and paste the text into the Code field of the Custom node.
This plugin lets the user open a window to edit a Custom node's "Code" field from within the Unreal Editor, without having to open a separate IDE or squint at the
small text field on the side of the screen. Multiple Custom nodes can be opened together as different tabs within the window, and the names of those tabs correspond to each
Custom node's "Description" field. Both the tabs and the window behave like other Unreal tabs in that they can be freely resized, moved, and docked within the editor,
allowing flexibility in how the user wants to interface with it.
Summary:
This was one of my first group projects in Unreal: a prototype for a Resident Evil-style survival horror game. Our group consisted mainly of programmers and engineers, so we each had our own
share of programming and Blueprint mechanics to implement on top of various other responsibilities such as sound and art. My non-programming responsibilities centered around creating or
obtaining art assets such as meshes, textures, and post-process effects. Some of the main mechanics I implemented include the character's Interact button, the different interaction
states for various objects, the enemy's "attack" and the character's damage state, as well as a special vision state that lets the player see things that are not normally visible.
Summary:
Our group decided to try to recreate the Wumpus World AI problem within Minecraft using Microsoft's Project Malmo library. Using Malmo, we designed the
Wumpus World game and its rules within Minecraft. Our agent runs on randomly generated worlds of different sizes with a varying number of pits.
Each block of the map represents a cell in the Wumpus World, and the type of block the agent stands on determines what it senses in the area.
The agent blindly traverses the map using a search algorithm so that it can backtrack once it finds an obstacle or a dead end. Towards the end of
the project we experimented with different search algorithms: plain DFS, DFS with Dijkstra's algorithm for the backtracking, A* without a heuristic,
and A* with a heuristic. The heuristic used for A* was based on the number of nearby blocks that had already been explored within a certain radius
of a given block.
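Below is a rough Python sketch (not the actual Malmo project code) of the exploration-based heuristic idea: score a candidate cell by how many already-explored cells sit within a small radius of it. The cell coordinates and the explored set are stand-ins for the project's real data structures.

    def explored_neighbors_heuristic(cell, explored, radius=2):
        """Count already-explored cells within `radius` of `cell`.
        How this count is folded into the A* cost is up to the agent."""
        x, y = cell
        count = 0
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                if (dx, dy) != (0, 0) and (x + dx, y + dy) in explored:
                    count += 1
        return count

    # Example: three explored cells around the origin gives a score of 3.
    explored = {(0, 1), (1, 0), (1, 1)}
    print(explored_neighbors_heuristic((0, 0), explored))  # -> 3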
Summary:
Our group wanted to try to predict the outcome of a given matchup between two NBA teams partway through a season. To do this, we first chose
the features of the games that would be used to predict the winner. We decided to use the "Four Factors" as described by Dean Oliver as our
data. For each game we compiled the team's Effective Field Goal Percentage, Turnover Percentage, Offensive Rebound Percentage, and
Free Throw Percentage. These four attributes were then plotted as a function of the opponent's Defensive Rating as calculated by Basketball
Reference, which is where we got our data. We then split the data for each attribute into training and testing sets to generate polynomial regression lines,
chose the degree with the least error, and used that regression function to predict the "next" game using the opponent's average Defensive Rating for
the season up to that point. Finally, we used Dean Oliver's percentage breakdown to compute a weighted value for each team to see whose weighted value is higher and by
how much. We chose an already-concluded season from 2010 to make sure our approach to the problem was plausible as a predictor.
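Here is a condensed Python sketch of the per-factor regression step, assuming numpy; the arrays are hypothetical placeholder values rather than our actual game data.

    import numpy as np

    # Placeholder data: x = opponent's Defensive Rating per game,
    # y = one of the Four Factors (e.g. Effective Field Goal Percentage) in that game.
    x = np.array([105.2, 108.1, 102.7, 110.3, 106.5, 104.9, 109.0, 107.4])
    y = np.array([0.512, 0.538, 0.495, 0.551, 0.520, 0.503, 0.544, 0.529])

    # Simple train/test split.
    x_train, x_test = x[:6], x[6:]
    y_train, y_test = y[:6], y[6:]

    # Fit polynomial regressions of several degrees and keep the one with the
    # lowest squared error on the held-out games.
    best_degree, best_err, best_fit = None, float("inf"), None
    for degree in range(1, 4):
        coeffs = np.polyfit(x_train, y_train, degree)
        err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
        if err < best_err:
            best_degree, best_err, best_fit = degree, err, coeffs

    # Predict the factor for the "next" game using the opponent's season-average
    # Defensive Rating up to that point (also a placeholder value here).
    opponent_avg_drtg = 106.0
    predicted_factor = np.polyval(best_fit, opponent_avg_drtg)
    print(best_degree, round(predicted_factor, 3))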
Summary:
Our group decided to look into how machine learning and neural networks can help in the medical field. One of the most prominent use cases is image
recognition and image processing for medical images. We decided to use colonoscopy image and video data since we found a free dataset to use
as well as a dataset provided by a professor and a graduate student. We went through the process of annotating around 600 images ourselves and combined them with
about 600 pre-annotated images from a public database. Since this was still a relatively small amount of data for training a neural network, we also performed some
image pre-processing such as rotations, flips, and zooming in or out to generate more varied image data. We then implemented the U-Net
architecture because there have already been good studies into its effectiveness, especially for image segmentation on medical imaging data. To speed up
training, we also modified parts of it to hold VGG16 weights pre-trained on ImageNet, so that the generic early object-detection layers of the network do not
need to be retrained as much. Our metrics for accuracy were the Dice Coefficient and Intersection over Union, which are two common ways of evaluating
image similarity, or in this case, the similarity between the ground-truth masks and the predicted segmentation masks. The model
was then trained for around 50 epochs with early stopping and a batch size of 4, since our GPUs were not the best. Overall, we got a respectable 0.844 Dice
Coefficient on the test data, but there is still room for improvement, such as a bigger training dataset (more images or better image pre-processing)
and possibly better training parameters. Furthermore, if this were used as a real application, it would likely run on a
video feed rather than separate still images so as to detect polyps in real time during a colonoscopy.
Public Database:
Colonoscopy Public Database
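For reference, here is a small numpy sketch of the two evaluation metrics mentioned above, computed on binary masks; it mirrors the idea of the metrics we used but is not the exact project code.

    import numpy as np

    def dice_coefficient(y_true, y_pred, eps=1e-7):
        """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks."""
        y_true = y_true.astype(bool)
        y_pred = y_pred.astype(bool)
        intersection = np.logical_and(y_true, y_pred).sum()
        return (2.0 * intersection + eps) / (y_true.sum() + y_pred.sum() + eps)

    def intersection_over_union(y_true, y_pred, eps=1e-7):
        """IoU = |A ∩ B| / |A ∪ B| for binary masks."""
        y_true = y_true.astype(bool)
        y_pred = y_pred.astype(bool)
        intersection = np.logical_and(y_true, y_pred).sum()
        union = np.logical_or(y_true, y_pred).sum()
        return (intersection + eps) / (union + eps)

    # Toy 4x4 "polyp" mask and an imperfect prediction.
    truth = np.array([[0,0,0,0],[0,1,1,0],[0,1,1,0],[0,0,0,0]])
    pred  = np.array([[0,0,0,0],[0,1,1,1],[0,1,0,0],[0,0,0,0]])
    print(dice_coefficient(truth, pred), intersection_over_union(truth, pred))  # 0.75, 0.6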
Summary :
This group project focused on creating an early version of a search engine using a collection of scraped webpages connected to the University of
California, Irvine website. We limited the scope to just UCI so that testing is still reasonable while being large enough to get a feel for
how well it runs at scale. The project takes an inverted index of webpages and performs searches against it. The inverted index maps tokens to a list of webpages
along with the TF-IDF value for each one. TF (term frequency) is how many times the token appears on the webpage, and IDF (Inverse Document
Frequency) is based on how many unique webpages the token appears in, inverted (1 / DF). The TF-IDF score stored for each
[word, webpage] pair is then TF * IDF. Given a search query, the search engine uses this inverted index to calculate the cosine similarity between the query
and each candidate webpage, ranking the webpages with the highest cosine similarity first. Cosine similarity treats the TF-IDF values of the matching tokens
for the query and for each webpage as vectors and measures how close the angle between the vectors is.
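A condensed Python sketch of the scoring idea (not the project's actual indexer) is shown below: it builds TF-IDF weights per [word, webpage] pair and ranks pages by cosine similarity against the query vector. The tiny corpus is made up purely for illustration.

    import math
    from collections import Counter, defaultdict

    pages = {
        "page1": "uci computer science computer courses",
        "page2": "uci campus housing housing",
        "page3": "computer science research at uci",
    }

    doc_tf = {url: Counter(text.split()) for url, text in pages.items()}
    df = Counter(tok for counts in doc_tf.values() for tok in counts)

    # TF-IDF per [word, webpage] pair, with IDF taken as 1/DF as described above.
    doc_weights = {url: {tok: tf * (1.0 / df[tok]) for tok, tf in counts.items()}
                   for url, counts in doc_tf.items()}

    # Inverted index: token -> {webpage: tf-idf}
    index = defaultdict(dict)
    for url, weights in doc_weights.items():
        for tok, w in weights.items():
            index[tok][url] = w

    doc_norms = {url: math.sqrt(sum(w * w for w in ws.values()))
                 for url, ws in doc_weights.items()}

    def search(query):
        q_tf = Counter(query.split())
        q_weights = {tok: tf * (1.0 / df[tok]) for tok, tf in q_tf.items() if tok in df}
        q_norm = math.sqrt(sum(w * w for w in q_weights.values())) or 1.0
        scores = defaultdict(float)
        for tok, qw in q_weights.items():
            for url, dw in index[tok].items():
                scores[url] += qw * dw
        # Highest cosine similarity (smallest angle) ranks first.
        return sorted(((url, s / (q_norm * doc_norms[url])) for url, s in scores.items()),
                      key=lambda kv: kv[1], reverse=True)

    print(search("computer science"))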
Summary :
This project was about doing full stack development for a web service that mimics what the old Netflix used to do. The backend is written in Java with MySQL, with
services that provide account management (registration, login, token authentication, etc.), browsing for movies to purchase (database lookups), and
billing through the PayPal API (as if it were a real service for purchasing movies). The frontend was developed using React and JavaScript as a
single-page application, so browsing for movies and the multiple database lookups that occur for account authentication can happen dynamically. Of course,
the actual design and layout of the webpage also use CSS and HTML in conjunction with the JavaScript. Unfortunately, there isn't really a way for me to host
the entire web service, so the raw files are just stored in a GitHub repository for now.
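The backend itself is written in Java, but the short Python sketch below illustrates the general session-token flow the account service follows: issue a random token at login, store it server-side, and check it on each request. All names here are made up for illustration.

    import secrets
    import time

    SESSIONS = {}                 # token -> (email, expiry timestamp)
    SESSION_LENGTH = 30 * 60      # 30 minutes

    def log_in(email, password_ok):
        """Issue a session token once the password check has passed."""
        if not password_ok:
            return None
        token = secrets.token_hex(16)
        SESSIONS[token] = (email, time.time() + SESSION_LENGTH)
        return token

    def authenticate(token):
        """Return the logged-in user's email, or None for a bad/expired token."""
        record = SESSIONS.get(token)
        if record is None or record[1] < time.time():
            SESSIONS.pop(token, None)
            return None
        return record[0]

    token = log_in("user@example.com", password_ok=True)
    print(authenticate(token))          # -> user@example.com
    print(authenticate("bogus-token"))  # -> None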
Summary :
This section is less of an actual project and more of a showcase for understanding various algorithms and the concepts of scalability and run-time analysis. I use
small test scripts for each algorithm to record different aspects such as run time, memory, or correctness, depending on the algorithm. Besides just analyzing
theoretical big-O estimates, the scripts run actual test cases of varying sizes to show how each algorithm performs.
Summary :
Here I tested various sorting algorithms. While many of them already have well-known big-O runtime approximations from study and analysis, testing and plotting
the effect of higher-degree runtimes on these known algorithms can serve as a good reference for how they scale in practice. The sorting algorithms include bubble sort,
annealing sort, insertion sort, and shell sort. They were mostly chosen for their very different approaches to sorting rather than to see which one is the absolute
best, since it is already pretty well known how these algorithms perform.
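The sketch below is a stripped-down Python version of the kind of timing harness used for these tests: time each sort on random lists of increasing size. Only insertion sort and a basic shell sort are shown; the actual scripts cover the other algorithms as well.

    import random
    import time

    def insertion_sort(a):
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        return a

    def shell_sort(a):
        gap = len(a) // 2
        while gap > 0:
            for i in range(gap, len(a)):
                key, j = a[i], i
                while j >= gap and a[j - gap] > key:
                    a[j] = a[j - gap]
                    j -= gap
                a[j] = key
            gap //= 2
        return a

    # Time each sort on random lists of increasing size.
    for n in (500, 1000, 2000, 4000):
        data = [random.randint(0, n) for _ in range(n)]
        for sort in (insertion_sort, shell_sort):
            start = time.perf_counter()
            sort(data.copy())
            print(f"{sort.__name__:>14} n={n:<5} {time.perf_counter() - start:.4f}s")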
Summary :
This project tests how well bin packing algorithms run. For humans, it is fairly easy to work out how to organize bins so that there is little excess space; for
algorithms, however, it is almost impossible to perfectly replicate that kind of bin packing without some form of prior knowledge. The different algorithms are Next Fit,
First Fit, Best Fit, First Fit Decreasing, and Best Fit Decreasing. The names refer to which bin each item should be placed into; for example, the First Fit algorithm
keeps track of how much space is left in each bin, places the current item into the first bin that can fit it, and allocates a new bin when no existing
bin can hold it. The "Decreasing" variants are the same except the items are sorted in decreasing order first. For the most part there is not an obviously best algorithm,
but Next Fit is clearly the worst since it only ever considers the most recently opened bin and never goes back to earlier bins.
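Below is a small Python sketch of two of these strategies (Next Fit and First Fit) to make the difference concrete; item sizes are in (0, 1] with unit-capacity bins, and the values are arbitrary examples.

    def next_fit(items, capacity=1.0):
        """Only ever looks at the most recently opened bin."""
        bins = [[]]
        remaining = capacity
        for item in items:
            if item <= remaining:
                bins[-1].append(item)
                remaining -= item
            else:
                bins.append([item])
                remaining = capacity - item
        return bins

    def first_fit(items, capacity=1.0):
        """Place each item in the first bin with enough space; open a new bin otherwise."""
        bins, space = [], []
        for item in items:
            for i, free in enumerate(space):
                if item <= free:
                    bins[i].append(item)
                    space[i] -= item
                    break
            else:
                bins.append([item])
                space.append(capacity - item)
        return bins

    items = [0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1]
    print(len(next_fit(items)), len(first_fit(items)))  # -> 5 4: Next Fit needs an extra bin here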
Summary :
This project explores how to traverse networks or graphs and the properties of the Barabási-Albert model of networks. The properties explored are the diameter of a graph,
the clustering coefficient, and the degree distribution. The Barabási-Albert model generates connections according to how many connections a node already
has. In other words, nodes with more connections are more likely to receive connections from new nodes, which is known as "preferential attachment." The diameter of a graph
is the longest shortest path over all node pairs. The clustering coefficient measures how much nodes tend to form tightly connected groups and can be calculated by dividing
the number of closed two-edge paths (those that form triangles) by the total number of two-edge paths. The degree distribution is the distribution of the nodes' neighbor counts.
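As a quick illustration, the networkx snippet below generates a Barabási-Albert graph and measures the three properties described above; the project's own scripts compute these over several graph sizes, but the idea is the same.

    import networkx as nx

    # Generate a Barabási-Albert graph via preferential attachment.
    G = nx.barabasi_albert_graph(n=500, m=3, seed=42)

    diameter = nx.diameter(G)                  # longest shortest path over all node pairs
    clustering = nx.transitivity(G)            # 3 * triangles / number of two-edge paths
    degree_histogram = nx.degree_histogram(G)  # degree_histogram[k] = number of nodes with degree k

    print(diameter, clustering, degree_histogram[:10])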