Introduction:
My name is Hoan Tran and I am a graduate of the University of California, Irvine, where I majored in Computer Science. My interests within the field
include AI/Machine Learning as well as Computer Vision.
These are some of the projects I have done so far that I felt were worth sharing. Many revolve around AI/Machine Learning concepts, but I also have experience with
other types of projects, such as full stack development and algorithm testing.
Summary:
While trying to use Unreal Engine as a tool to learn more about games and graphics, I found that learning shader programming natively within Unreal was not very accessible.
There is a node called "Custom" inside the UE5 shader graph that allows the user to add HLSL code to the shader graph, but its usability is cumbersome. The node accepts
inputs and outputs that can then be wired into the rest of the shader graph, so the user can mix regular shader graph nodes with HLSL code. For the most part, it is
perfectly functional at allowing HLSL code to be added to the shader graph; the main issues stem from its usability and UI implementation.
Currently, the code has to be typed into a small text field off to the side of the screen, and the inputs and outputs need to be added manually, one by one, with names that
match the HLSL variables they feed into within the code. Most workflows and tutorials involving the Custom node therefore use a third-party text editor or IDE to author the
HLSL code and then copy and paste the text into the Code field of the Custom node.
This plugin lets the user open a window to edit a Custom node's "Code" field from within the Unreal Editor, without having to open a separate IDE or squint at the
small text field on the side of the screen. Multiple Custom nodes can be opened together as different tabs within the window, and the names of those tabs correspond to each
Custom node's "Description" field. Both the tabs and the window behave like other Unreal tabs in that they can be freely resized, moved, and docked within the editor,
allowing flexibility in how the user wants to interface with it.
Summary:
This was one of my first group projects in Unreal: a prototype for a Resident Evil-style survival horror game. Our group consisted mainly of programmers and engineers, so we each had our own
share of programming and Blueprint mechanics to implement on top of various other responsibilities such as sound and art. My non-programming responsibilities centered around creating or
obtaining art assets such as meshes, textures, and post-process effects. Some of the main mechanics I implemented include the character's Interact button, the different interaction
states for various objects, the enemy's "attack" and the character's damage state, as well as a special vision state that lets the player see things that are not normally visible.
Summary:
Our group decided to try to recreate the Wumpus World AI problem within Minecraft using Microsoft's Project Malmo library. Using Malmo, we designed the
Wumpus World game and its rules within Minecraft. Our agent runs on randomly generated worlds of different sizes with a varying number of pits.
Each block of the map represents a cell in the Wumpus World, and the type of block the agent stands on determines what it senses in the area.
The agent blindly traverses the map using a search algorithm so that it can backtrack once it finds an obstacle or a dead end. Towards the end of
the project we experimented with different search algorithms: plain DFS, DFS with Dijkstra's algorithm for the backtracking, A* without a heuristic,
and A* with a heuristic. The heuristic used for A* was based on the number of nearby blocks that had already been explored within a certain radius
of a given block.
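Below is a rough Python sketch (not the actual Malmo project code) of the exploration-based heuristic idea: score a candidate cell by how many already-explored cells sit within a small radius of it. The cell coordinates and the explored set are stand-ins for the project's real data structures.

    def explored_neighbors_heuristic(cell, explored, radius=2):
        """Count already-explored cells within `radius` of `cell`.
        How this count is folded into the A* cost is up to the agent."""
        x, y = cell
        count = 0
        for dx in range(-radius, radius + 1):
            for dy in range(-radius, radius + 1):
                if (dx, dy) != (0, 0) and (x + dx, y + dy) in explored:
                    count += 1
        return count

    # Example: three explored cells around the origin gives a score of 3.
    explored = {(0, 1), (1, 0), (1, 1)}
    print(explored_neighbors_heuristic((0, 0), explored))  # -> 3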
Summary:
Our group wanted to try to predict the outcome of a given matchup between two NBA teams partway through a season. To do this, we first chose
the features of the games that would be used to predict the winner. We decided to use the "Four Factors" as described by Dean Oliver as our
data. For each game we compiled the team's Effective Field Goal Percentage, Turnover Percentage, Offensive Rebound Percentage, and
Free Throw Percentage. These four attributes were then plotted as a function of the opponent's Defensive Rating as calculated by Basketball
Reference, which is where we got our data. We then split the data for each attribute into training and testing sets to generate polynomial regression lines,
chose the degree with the least error, and used that regression function to predict the "next" game using the opponent's average Defensive Rating for
the season up to that point. Finally, we used Dean Oliver's percentage breakdown to compute a weighted value for each team to see whose weighted value is higher and by
how much. We chose an already-concluded season from 2010 to make sure our approach to the problem was plausible as a predictor.
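Here is a condensed Python sketch of the per-factor regression step, assuming numpy; the arrays are hypothetical placeholder values rather than our actual game data.

    import numpy as np

    # Placeholder data: x = opponent's Defensive Rating per game,
    # y = one of the Four Factors (e.g. Effective Field Goal Percentage) in that game.
    x = np.array([105.2, 108.1, 102.7, 110.3, 106.5, 104.9, 109.0, 107.4])
    y = np.array([0.512, 0.538, 0.495, 0.551, 0.520, 0.503, 0.544, 0.529])

    # Simple train/test split.
    x_train, x_test = x[:6], x[6:]
    y_train, y_test = y[:6], y[6:]

    # Fit polynomial regressions of several degrees and keep the one with the
    # lowest squared error on the held-out games.
    best_degree, best_err, best_fit = None, float("inf"), None
    for degree in range(1, 4):
        coeffs = np.polyfit(x_train, y_train, degree)
        err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
        if err < best_err:
            best_degree, best_err, best_fit = degree, err, coeffs

    # Predict the factor for the "next" game using the opponent's season-average
    # Defensive Rating up to that point (also a placeholder value here).
    opponent_avg_drtg = 106.0
    predicted_factor = np.polyval(best_fit, opponent_avg_drtg)
    print(best_degree, round(predicted_factor, 3))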
Summary:
Our group decided to look into how machine learning and neural networks can help in the medical field. One of the most prominent use cases is image
recognition and image processing for medical images. We decided to use colonoscopy image and video data since we found a free dataset to use
as well as a dataset provided by a professor and a graduate student. We went through the process of annotating around 600 images ourselves and combined them with
about 600 pre-annotated images from a public database. Since this was still a relatively small amount of data for training a neural network, we also performed some
image pre-processing such as rotations, flips, and zooming in or out to generate more varied image data. We then implemented the U-Net
architecture because there have already been good studies into its effectiveness, especially for image segmentation on medical imaging data. To speed up
training, we also modified parts of it to hold VGG16 weights pre-trained on ImageNet, so that the generic early object-detection layers of the network do not
need to be retrained as much. Our metrics for accuracy were the Dice Coefficient and Intersection over Union, which are two common ways of evaluating
image similarity, or in this case, the similarity between the ground-truth masks and the predicted segmentation masks. The model
was then trained for around 50 epochs with early stopping and a batch size of 4, since our GPUs were not the best. Overall, we got a respectable 0.844 Dice
Coefficient on the test data, but there is still room for improvement, such as a bigger training dataset (more images or better image pre-processing)
and possibly better training parameters. Furthermore, if this were used as a real application, it would likely run on a
video feed rather than separate still images so as to detect polyps in real time during a colonoscopy.
Public Database:
Colonoscopy Public Database
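For reference, here is a small numpy sketch of the two evaluation metrics mentioned above, computed on binary masks; it mirrors the idea of the metrics we used but is not the exact project code.

    import numpy as np

    def dice_coefficient(y_true, y_pred, eps=1e-7):
        """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks."""
        y_true = y_true.astype(bool)
        y_pred = y_pred.astype(bool)
        intersection = np.logical_and(y_true, y_pred).sum()
        return (2.0 * intersection + eps) / (y_true.sum() + y_pred.sum() + eps)

    def intersection_over_union(y_true, y_pred, eps=1e-7):
        """IoU = |A ∩ B| / |A ∪ B| for binary masks."""
        y_true = y_true.astype(bool)
        y_pred = y_pred.astype(bool)
        intersection = np.logical_and(y_true, y_pred).sum()
        union = np.logical_or(y_true, y_pred).sum()
        return (intersection + eps) / (union + eps)

    # Toy 4x4 "polyp" mask and an imperfect prediction.
    truth = np.array([[0,0,0,0],[0,1,1,0],[0,1,1,0],[0,0,0,0]])
    pred  = np.array([[0,0,0,0],[0,1,1,1],[0,1,0,0],[0,0,0,0]])
    print(dice_coefficient(truth, pred), intersection_over_union(truth, pred))  # 0.75, 0.6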
Summary :
This group project focused on creating an early version of a search engine using a collection of scraped webpages connected to the University of
California, Irvine website. We limited the scope to just UCI so that testing is still reasonable while being large enough to get a feel for
how well it runs at scale. The project takes an inverted index of webpages and performs searches against it. The inverted index maps tokens to a list of webpages
along with the TF-IDF value for each one. TF (term frequency) is how many times the token appears on the webpage, and IDF (Inverse Document
Frequency) is based on how many unique webpages the token appears in, inverted (1 / DF). The TF-IDF score stored for each
[word, webpage] pair is then TF * IDF. Given a search query, the search engine uses this inverted index to calculate the cosine similarity between the query
and each candidate webpage, ranking the webpages with the highest cosine similarity first. Cosine similarity treats the TF-IDF values of the matching tokens
for the query and for each webpage as vectors and measures how close the angle between the vectors is.
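A condensed Python sketch of the scoring idea (not the project's actual indexer) is shown below: it builds TF-IDF weights per [word, webpage] pair and ranks pages by cosine similarity against the query vector. The tiny corpus is made up purely for illustration.

    import math
    from collections import Counter, defaultdict

    pages = {
        "page1": "uci computer science computer courses",
        "page2": "uci campus housing housing",
        "page3": "computer science research at uci",
    }

    doc_tf = {url: Counter(text.split()) for url, text in pages.items()}
    df = Counter(tok for counts in doc_tf.values() for tok in counts)

    # TF-IDF per [word, webpage] pair, with IDF taken as 1/DF as described above.
    doc_weights = {url: {tok: tf * (1.0 / df[tok]) for tok, tf in counts.items()}
                   for url, counts in doc_tf.items()}

    # Inverted index: token -> {webpage: tf-idf}
    index = defaultdict(dict)
    for url, weights in doc_weights.items():
        for tok, w in weights.items():
            index[tok][url] = w

    doc_norms = {url: math.sqrt(sum(w * w for w in ws.values()))
                 for url, ws in doc_weights.items()}

    def search(query):
        q_tf = Counter(query.split())
        q_weights = {tok: tf * (1.0 / df[tok]) for tok, tf in q_tf.items() if tok in df}
        q_norm = math.sqrt(sum(w * w for w in q_weights.values())) or 1.0
        scores = defaultdict(float)
        for tok, qw in q_weights.items():
            for url, dw in index[tok].items():
                scores[url] += qw * dw
        # Highest cosine similarity (smallest angle) ranks first.
        return sorted(((url, s / (q_norm * doc_norms[url])) for url, s in scores.items()),
                      key=lambda kv: kv[1], reverse=True)

    print(search("computer science"))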
Summary :
This project was about doing full stack development for a web service that mimics what the old Netflix used to do. The backend is written in Java with MySQL, with
services that provide account management (registration, login, token authentication, etc.), browsing for movies to purchase (database lookups), and
billing through the PayPal API (as if it were a real service for purchasing movies). The frontend was developed using React and JavaScript as a
single-page application, so browsing for movies and the multiple database lookups that occur for account authentication can happen dynamically. Of course,
the actual design and layout of the webpage also use CSS and HTML in conjunction with the JavaScript. Unfortunately, there isn't really a way for me to host
the entire web service, so the raw files are just stored in a GitHub repository for now.
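The backend itself is written in Java, but the short Python sketch below illustrates the general session-token flow the account service follows: issue a random token at login, store it server-side, and check it on each request. All names here are made up for illustration.

    import secrets
    import time

    SESSIONS = {}                 # token -> (email, expiry timestamp)
    SESSION_LENGTH = 30 * 60      # 30 minutes

    def log_in(email, password_ok):
        """Issue a session token once the password check has passed."""
        if not password_ok:
            return None
        token = secrets.token_hex(16)
        SESSIONS[token] = (email, time.time() + SESSION_LENGTH)
        return token

    def authenticate(token):
        """Return the logged-in user's email, or None for a bad/expired token."""
        record = SESSIONS.get(token)
        if record is None or record[1] < time.time():
            SESSIONS.pop(token, None)
            return None
        return record[0]

    token = log_in("user@example.com", password_ok=True)
    print(authenticate(token))          # -> user@example.com
    print(authenticate("bogus-token"))  # -> None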
Summary :
This section is less of an actual project and more of a showcase for understanding various algorithms and the concepts of scalability and run-time analysis. I use
small test scripts for each algorithm to record different aspects such as run time, memory, or correctness, depending on the algorithm. Besides just analyzing
theoretical big-O estimates, the scripts run actual test cases of varying sizes to show how each algorithm performs.
Summary :
Here I tested various sorting algorithms. While many of them already have well-known big-O runtime approximations from study and analysis, testing and plotting
the effect of higher-degree runtimes on these known algorithms can serve as a good reference for how they scale in practice. The sorting algorithms include bubble sort,
annealing sort, insertion sort, and shell sort. They were mostly chosen for their very different approaches to sorting rather than to see which one is the absolute
best, since it is already pretty well known how these algorithms perform.
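The sketch below is a stripped-down Python version of the kind of timing harness used for these tests: time each sort on random lists of increasing size. Only insertion sort and a basic shell sort are shown; the actual scripts cover the other algorithms as well.

    import random
    import time

    def insertion_sort(a):
        for i in range(1, len(a)):
            key, j = a[i], i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        return a

    def shell_sort(a):
        gap = len(a) // 2
        while gap > 0:
            for i in range(gap, len(a)):
                key, j = a[i], i
                while j >= gap and a[j - gap] > key:
                    a[j] = a[j - gap]
                    j -= gap
                a[j] = key
            gap //= 2
        return a

    # Time each sort on random lists of increasing size.
    for n in (500, 1000, 2000, 4000):
        data = [random.randint(0, n) for _ in range(n)]
        for sort in (insertion_sort, shell_sort):
            start = time.perf_counter()
            sort(data.copy())
            print(f"{sort.__name__:>14} n={n:<5} {time.perf_counter() - start:.4f}s")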
Summary :
This project tests how well bin packing algorithms run. For humans, it is fairly easy to work out how to organize bins so that there is little excess space; for
algorithms, however, it is almost impossible to perfectly replicate that kind of bin packing without some form of prior knowledge. The different algorithms are Next Fit,
First Fit, Best Fit, First Fit Decreasing, and Best Fit Decreasing. The names refer to which bin each item should be placed into; for example, the First Fit algorithm
keeps track of how much space is left in each bin, places the current item into the first bin that can fit it, and allocates a new bin when no existing
bin can hold it. The "Decreasing" variants are the same except the items are sorted in decreasing order first. For the most part there is not an obviously best algorithm,
but Next Fit is clearly the worst since it only ever considers the most recently opened bin and never goes back to earlier bins.
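Below is a small Python sketch of two of these strategies (Next Fit and First Fit) to make the difference concrete; item sizes are in (0, 1] with unit-capacity bins, and the values are arbitrary examples.

    def next_fit(items, capacity=1.0):
        """Only ever looks at the most recently opened bin."""
        bins = [[]]
        remaining = capacity
        for item in items:
            if item <= remaining:
                bins[-1].append(item)
                remaining -= item
            else:
                bins.append([item])
                remaining = capacity - item
        return bins

    def first_fit(items, capacity=1.0):
        """Place each item in the first bin with enough space; open a new bin otherwise."""
        bins, space = [], []
        for item in items:
            for i, free in enumerate(space):
                if item <= free:
                    bins[i].append(item)
                    space[i] -= item
                    break
            else:
                bins.append([item])
                space.append(capacity - item)
        return bins

    items = [0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1]
    print(len(next_fit(items)), len(first_fit(items)))  # -> 5 4: Next Fit needs an extra bin here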
Summary :
This project explores how to traverse networks or graphs and the properties of the Barabási-Albert model of networks. The properties explored are the diameter of a graph,
the clustering coefficient, and the degree distribution. The Barabási-Albert model generates connections according to how many connections a node already
has. In other words, nodes with more connections are more likely to receive connections from new nodes, which is known as "preferential attachment." The diameter of a graph
is the longest shortest path over all node pairs. The clustering coefficient measures how much nodes tend to form tightly connected groups and can be calculated by dividing
the number of closed two-edge paths (those that form triangles) by the total number of two-edge paths. The degree distribution is the distribution of the nodes' neighbor counts.
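As a quick illustration, the networkx snippet below generates a Barabási-Albert graph and measures the three properties described above; the project's own scripts compute these over several graph sizes, but the idea is the same.

    import networkx as nx

    # Generate a Barabási-Albert graph via preferential attachment.
    G = nx.barabasi_albert_graph(n=500, m=3, seed=42)

    diameter = nx.diameter(G)                  # longest shortest path over all node pairs
    clustering = nx.transitivity(G)            # 3 * triangles / number of two-edge paths
    degree_histogram = nx.degree_histogram(G)  # degree_histogram[k] = number of nodes with degree k

    print(diameter, clustering, degree_histogram[:10])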