I have been interested in machine learning since I was a PhD student back in 2010. I was always in awe of papers that applied machine learning to cluster or classify data. My original PhD topic was to discover blood biomarkers in early-onset Parkinson's Disease (PD) patients who had not taken any pharmacological treatment. The overarching goal was to develop a diagnostic test that could predict PD early and non-invasively, because by the time symptoms develop, substantial neurodegeneration has already taken place. Essentially, we wanted to publish a study similar to Molecular markers of early Parkinson's disease based on gene expression in blood, but using sequencing instead of microarrays.
However, there were two major issues with the original project. The first was that we sequenced whole blood, so 70-80% of the sequencing real estate was taken up by haemoglobin transcripts; we tried sequencing deeper, but it wasn't enough. The other was strand invasion during the template-switching step of our experimental protocol. The paper we would have wanted to write would be something like RNA sequencing of whole blood reveals early alterations in immune cells and gene expression in Parkinson's disease (paywall).
The next project I wanted to apply machine learning to was during my first postdoc. By then I was working on whole-exome sequencing of rare disease patients, because I wanted to move into the clinical genomics space after my PhD. In my fellowship application, which was unsuccessful, I had proposed using Random Forests to classify genetic variants. One of the main problems with finding causative mutations for rare genetic diseases is that there are too many variants. The problem is alleviated a little by the fact that rare disease mutations have high penetrance, but even then there are just too many. I wanted to incorporate functional transcriptomics into the feature space and argued that this would help us narrow down the list of potential causative mutations, i.e., make it easier to find the needle in the haystack. It was almost funded, but no cigar.
After a hiatus of several years in my machine learning journey, I started looking into deep learning. I had purchased "Deep Learning with R" and was slowly going through it because I felt it could be useful for prioritising neoantigens. This was still a couple of years before ChatGPT came out, but many people had already started using deep learning for various applications because TensorFlow, PyTorch, and Keras had made it much more accessible. A colleague and I started to have regular meetings to discuss deep learning. I focused on lower-level topics (for example, I was trying to implement a simple neural network from scratch) because I like to understand the foundations, while my colleague discussed higher-level topics like the application of deep learning to natural language processing. It was then that I learned about the Attention Is All You Need paper, but between the two of us, we didn't have the expertise to fully decipher the technical details. Around the time ChatGPT came out, my colleague quit, so the deep learning sessions came to an end.
Why all the history!? Since I'm going to give my thoughts on Artificial Intelligence (AI), I thought I'd give you some idea of my (limited) background so you can put my thoughts into perspective. Also, I just wanted to write something a bit more personal, since I don't do that very often nowadays. The main point is that I don't have formal training in machine learning (or mathematics or statistics, for that matter). I studied microbiology and biochemistry at university and learned bioinformatics pretty much by myself. If you follow this blog, you probably knew that already. Most of what I know about machine learning is shared in my GitHub repository.
With my background description out of the way, I can give my first and foremost thought: I don't like the term AI. I didn't like it before, and now with all the buzz, I dislike it even more. The main reason is that AI is a vast field, and saying AI could mean many different things. I guess nowadays AI is just synonymous with ChatGPT and/or Large Language Models (LLMs), but I would still prefer it if people said ChatGPT and/or LLMs instead of AI. LLMs are built on the Transformer architecture, which was introduced in the Attention Is All You Need paper that I mentioned earlier. Transformers are deep neural networks (neural networks with many layers and parameters), and the use of such networks is called deep learning. Deep learning is a subset of machine learning. And finally, machine learning is a subset of AI.
Secondly, I do believe that LLMs are useful for certain tasks (debugging code, generating templates, etc.), and as you may have seen in my recent blog posts, I've been exploring different ways to use LLMs offline. As a bioinformatician, I have to know a bit of everything, and LLMs can assist me in understanding more about a topic. However, I never rely on LLMs alone to understand a topic because, as you probably already know, they hallucinate, i.e., produce outputs that sound plausible but are not correct. I believe that AI should really stand for Augmented Intelligence, and as nicely illustrated in The LLM Curve of Impact on Software Engineers, LLMs have varying degrees of usefulness depending on your work/position.
But I feel that AI companies don't want to advertise AI as augmented intelligence, because the pitch is that it will completely replace some task or role. This brings me to my major gripe, which is all the hype surrounding AI. Sure, I think LLMs are useful, but would it impact me significantly if they didn't exist today? Not really. Perhaps I'm not making the most out of them (and I am actively looking into different ways of using LLMs), but I don't know a single person who absolutely relies on LLMs or has been replaced by them. And we may never become fully reliant on LLMs until hallucinations disappear, because until then we will always need to manually double-check the output. And to be able to double-check the output, we need to have some sort of expertise in the first place.
I was listening to the 404 Media Podcast where they were discussing the Microsoft publication "The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers". The title of the publication pretty much summarises the findings. In the podcast, they discussed an example of how we no longer remember phone numbers (at least those of us born before the advent of mobile phones) because mobile phones have built-in phone books. Like the podcasters, I still remember the home phone numbers (3258831 and 3257747) we had when I was a kid; nowadays I can't even remember my own phone number. Remembering phone numbers is probably not a good use of our mental capacity, and it's fine to outsource this task to our phones. Remembering code syntax (and deciphering error messages) can probably be outsourced to LLMs in the same manner. The higher-level tasks, such as the logic and design of a program, tasks that require critical thinking, should be done by us.
Recently I have been checking out {ellmer}, and I have pretty much the same stance on LLMs as its authors:
In general, we recommend avoiding LLMs where accuracy is critical. That said, there are still many cases for their use. For example, even though they always require some manual fiddling, you might save a bunch of time even with an 80% correct solution. In fact, even a not-so-good solution can still be useful because it makes it easier to get started: it's easier to react to something rather than to have to start from scratch with a blank page.
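To make that concrete, here is a minimal sketch of how I have been trying this out with a local model through {ellmer} and Ollama. It assumes Ollama is running locally and that a model has already been pulled (the model name and the prompt below are just examples); chat_ollama() and the $chat() method come from {ellmer}.

```r
# Minimal sketch: asking a locally hosted model a question via {ellmer}.
# Assumes Ollama is running and the "llama3.1" model has been pulled;
# the model name and the prompt are just examples.
library(ellmer)

chat <- chat_ollama(model = "llama3.1")

# The response is something to react to, not the final word; anything it
# claims still needs to be double-checked against the documentation.
chat$chat("In R, how do I read a gzipped TSV file with readr?")
```

Even when the answer is only 80% correct, it gives me something concrete to react to, which is exactly the point made in the quote above.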

This work is licensed under a Creative Commons Attribution 4.0 International License.