My Thoughts on AI

I've been in the AI sphere for more than a decade now, after spending a portion of my Master's degree on Machine Learning methods for healthcare data. When I was in school, the term Artificial Intelligence was kind of an umbrella over Machine Learning, Knowledge Discovery, Data Analytics, Data Science, and other terms of the like, but I focused my studies specifically on the Machine Learning side (in particular, supervised learning: training models on data with known outputs).
All of this to say, AI isn't "new". There's just been a more prevalent injection of AI into everything we do. Think about Facebook: back when I joined in 2006 – I was able to get an account because I was taking classes at Mizzou while still in high school – you typically had to have a person's e-mail address or know their name in order to add them as a "Friend". It mostly worked that way because e-mail was how you typically communicated with people. At the time, "text messages" cost $0.10 per sent or received message, and "plans" were just getting to "unlimited calling" where they used to top out at unlimited nights and weekends. Halfway through my senior year of high school, Facebook opened up to high school students as well (a select group of them, anyway), and then I was able to look for some of my high school friends who were on Facebook (I still have no clue how they got added initially).
I bring all of this up because once a flood of people started to join Facebook, it started to use recommender algorithms to suggest friends. That worked by using "AI" behind the scenes. Recommendation systems have been around for a while, but they commonly existed in places where discovery is what powers future traffic. Facebook is the one I remember interacting with first, but the likes of Netflix and Amazon also come to mind, as their businesses were far more focused on online use cases where you're actually feeding in your data directly. Netflix aligned its recommendations based on your ratings of movies (and knowledge of the movies you rented), Amazon based on your orders, and Facebook based on the friends you were presented and then added (and who added you back).
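If you've never seen one, here's a minimal sketch of that rating-based idea in Python. The tiny matrix and the nearest-neighbor approach are made up purely for illustration; this is not how any of these companies actually implement their systems.

```python
# Minimal sketch of rating-based recommendation; the data is invented.
import numpy as np

# Rows = users, columns = movies; entries are ratings (0 = not rated)
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 5, 1],
    [1, 0, 5, 4],
])

def cosine(a, b):
    """Cosine similarity between two rating rows."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Find the user whose ratings look most like user 0's
target = ratings[0]
sims = [cosine(target, ratings[u]) for u in range(1, len(ratings))]
neighbor = ratings[1 + int(np.argmax(sims))]

# Recommend movies the neighbor rated highly that user 0 hasn't seen
recs = [m for m in range(ratings.shape[1]) if target[m] == 0 and neighbor[m] >= 4]
print(recs)  # indices of movies to recommend to user 0
```

The same shape of idea underlies "people you may know" and "because you watched" lists: your data in, someone similar's preferences out.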
AI has been present in the world for a long time, especially if you've used social media or any form of online content delivery (YouTube, Netflix, Amazon, etc.). This data has been farmed from you in one way or another, with things tuned specifically to your liking and morphed by the underlying "algorithms" that recommend you the "next video", a "potential friend", or even "a new product to try". One of the things AI did not do well for a long time was handle written language. Anyone who spent time on Machine Learning in the 2000s and early 2010s knew that natural language processing (NLP) was one of the really difficult problems. Early NLP was mostly focused on identifying parts of speech in order to better understand where to extract things like action words from a request. NLP had improved a lot from when I first looked into it (in 2012) to when I was working in the industry, but it still relied heavily on well-curated datasets for proper identification. I recall a class project in 2013 doing classification on tweets based on a small training/test set with human-curated labels marking tweets with and without "medical terms", drawn from a previously parsed set. My group hand-labeled a small subset of the tweets in order to build a machine learning model that could classify the rest (I think there were over 30,000 available and we labeled about 300). We used string tokenization algorithms to tokenize the tweets; the tokens containing medical terms ended up with higher weights, and when the rest of the data was scored against those weights, it helped surface tweets that may have mentioned a drug.
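For flavor, here's a minimal sketch of that kind of pipeline in today's terms. Everything here is a hypothetical stand-in: the tweets are invented, and scikit-learn with TF-IDF weighting is a modern proxy for what we actually did in 2013, not a reconstruction of it.

```python
# Hypothetical sketch of the tweet-classification approach described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# A hand-labeled subset (~300 tweets in our case): 1 = mentions a drug/medical term
labeled_tweets = [
    "just took some ibuprofen for this headache",
    "picked up my amoxicillin prescription today",
    "great game last night, what a finish",
    "traffic on the highway was brutal this morning",
]
labels = [1, 1, 0, 0]

# Tokenize the tweets and weight terms; rarer, more distinctive tokens
# (like drug names) end up carrying more weight
vectorizer = TfidfVectorizer(lowercase=True, token_pattern=r"[a-z']+")
X = vectorizer.fit_transform(labeled_tweets)

# Train a simple classifier on the labeled subset
model = LogisticRegression()
model.fit(X, labels)

# Score the remaining ~30,000 unlabeled tweets against the learned weights
unlabeled = ["my doctor gave me a new ibuprofen prescription"]
scores = model.predict_proba(vectorizer.transform(unlabeled))[:, 1]
print(scores)  # higher scores suggest a likely drug mention
```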
I bring all of these examples up because the AI being lauded right now is very much related to what I've talked about previously. Large Language Models are a mix of natural language processing, recommender systems, and, essentially, neural networks under the covers. When ChatGPT was first released, I remember sitting in a restaurant hearing people talk about how they were already using it to draft e-mails, write presentations, summarize e-mails, etc. When I heard that and looked at the general output from the few things I had messed around with in ChatGPT, I realized there was a lot of trust being placed in a technology still in its infancy (this was GPT-3.5), released with marketing that seemingly promised the public it was ready for primetime without question or concern. As OpenAI and ChatGPT gained more popularity and notoriety, you heard of instances where it had "hallucinated", providing false information to users through the backend APIs. The first instance I remember making real waves was Air Canada's chatbot inventing a bereavement fare policy, which a tribunal later held the airline to. It's crazy to me that these types of assistants were put into practice without the rigor that even earlier machine learning systems had been subjected to.
Big logistics companies like FedEx and UPS spent millions, if not billions, on machine learning models to optimize routes and shave seconds off of their drivers' route times. Customer service has historically been a very human interaction, with maybe some automation to point people in the right direction or to parrot a true knowledgebase of data available in some database, but not to generate human-like chat responses. My big concern with these new flavors of AI is that they are being put into practice without evaluation ever taking place. Legacy Machine Learning (gosh, it hurts to say that, knowing it's still a huge foundational piece of current operations at most companies) has always gone through evaluation pipelines. You hear about cross-validation, where curated data is held out to validate that a model reduces false positives or false negatives (depending on the type of model and what it's predicting) enough to limit the risk of that model being used improperly. In my time with large language models, I've yet to see evaluation metrics for a proposed LLM app or assistant that don't eventually rely on another large language model to certify the responses at scale. The techniques and practices from these "legacy" systems were robust and STILL required a lot of work and convincing to get people to even consider trusting them with simple data-reference tasks. But then things like ChatGPT come out, are made available to the public, are given high praise by non-experts, and are seen as perfect in every way by the layperson because "it seems magical" and "it solves my problem and seems to do well".
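For contrast, here's roughly what that cross-validation discipline looks like in code: a minimal sketch using scikit-learn on synthetic data, not any particular company's pipeline.

```python
# Minimal sketch of cross-validating a classical model; synthetic data only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for a curated, labeled dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# 5-fold cross-validation: train on 4/5 of the data, validate on the held-out
# fifth, rotate through all five folds, and report each fold's score
scores = cross_val_score(RandomForestClassifier(random_state=42), X, y, cv=5)
print(scores.mean(), scores.std())  # stable scores across folds build trust
```

The point isn't the specific model; it's that every fold produces a number a human can inspect before anyone trusts the system.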
If you want to know my actual thoughts on AI: AI is very powerful. AI can really make a difference in how we run our everyday lives and let us spend less brainpower and less of our daily existence on tasks that can be handled automatically. However, access to these tools really needs, in my opinion, to be limited to people who will apply ethical and knowledgeable guidance in using them and limit the exposure from the errors that are baked into how they work. The historic metric for old machine learning models was AUC (area under the ROC curve), along with examining the confusion matrix to understand the riskiness of a false negative or false positive on the selected test data. Known outcomes were also typically fed back into the model long-term for re-training and re-processing, to continually improve it. I'm not seeing tooling that supports the same kinds of capabilities for the new AI/LLM models, for the same reason NLP was one of the harder problems in the old machine learning world: parsing numbers out of a machine learning algorithm is very different from having to read and analyze the free-form output of a large language model. This is the same reason large college courses favored multiple-choice tests over free-response tests: multiple choice is so much easier to grade because machines can automate it, while free response requires someone to actually read and analyze the content. In the end, if you're using a Large Language Model to judge the output of a Large Language Model, you might as well define every word in the English language circularly.
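For readers who haven't worked with these metrics, here's a minimal sketch of computing them, with made-up labels and model scores:

```python
# Minimal sketch of the classical evaluation metrics mentioned above,
# using invented labels and probabilities purely for illustration.
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 0, 1]                     # known test outcomes
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]   # model probabilities

# AUC: how well the model ranks positives above negatives (1.0 = perfect)
print("AUC:", roc_auc_score(y_true, y_score))

# Confusion matrix at a 0.5 threshold: rows are truth, columns are prediction,
# so the off-diagonal cells are the false positives and false negatives
y_pred = [int(s >= 0.5) for s in y_score]
print(confusion_matrix(y_true, y_pred))
```

Notice that both metrics only work because the outputs are numbers compared against known answers; there's no equivalent five-line script for grading a paragraph of generated prose.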
I'm excited for a tool that can help with some of the harder language-based problems (and human-like communication generation), but like most things, I fear that the technology in the wrong hands will lead to greater distrust of these systems, or to uses that sow more distrust into our world over time. However, I will still not use AI to generate text-based output for my ideas or as an editor. I've previously attempted both with AI, and it removed my voice from the content I had initially put together, making it seem as if it's not really me saying those things, or dropping very important clarifying statements and potentially making things misleading. An editor that changes the meaning of your sentences is not a great editor, and that's what I've seen at this point with most of the LLMs.
Disclaimer: The only portion of this article put together with any AI is the image, which was generated with Gemini; the link to the chat conversation I had with it is included with the image.