How AI is helping us fight COVID
If you were to summarize each year of your life in a single word, what would those words be? Each word would probably be closely tied to the era that spawned it, so looking back at the list would be like flipping through a yearbook of your journeys! According to Global Language Monitor, an American data-research company that tracks trends in the worldwide use of the English language, the most popular word of 2020 was, unsurprisingly, “COVID”. The COVID-19 pandemic has changed our lives drastically, but pandemics are not new to our world. The Spanish flu of 1918 to 1920 claimed as many as 100 million lives by some estimates, and the Black Death in the 14th century is estimated to have killed more than 75 million people (about twice the population of California). What makes COVID different is that, whilst viruses are still around, our technologies have advanced. Most notably, Artificial Intelligence (AI) has emerged, which leads me to ask a particularly important question: can AI help us in our fight against COVID?
AI was widely used in medical image processing long before the COVID-19 pandemic. Deep learning and neural networks have been explored to aid the segmentation of anatomical features in X-ray, CT, MR and other medical imaging modalities, as discussed in Shen et al. (2017) and Maier et al. (2019). According to Lundervold and Lundervold (2019), such segmented anatomical features have enabled the development of diagnosis and health-outcome prediction systems for a wide array of medical conditions. Furthermore, AI image enhancement tools can give a clearer view of diagnostically relevant structures in medical images, which supports tasks such as detecting COVID-19 infections in X-ray and CT images, as suggested in Panwar et al. (2020).
Alongside medical image processing, machine learning algorithms are combined with Big Data to improve the prevention and containment of COVID-19. Punn et al. (2020) used COVID-19 data from the Johns Hopkins database to train a predictive model on attributes such as the location of recent transmissions and the ratio of recovery and death rates. Such predictions allow for rapid decision making with regard to preventive measures, including the extent of lockdown restrictions and the allocation of urgent healthcare resources.
However, if we circle back to language, one can ask: how has language evolved since the birth of AI? Natural language processing (NLP), a sub-field of AI, has a wide range of applications, from statistical machine translation to voice recognition, and from sentiment analysis to automatic text generation, to name but a few! These rapid technological advancements already make our lives easier, for instance by letting us send a voice text while running or driving, but the possible benefits go much further. As suggested in Esteva et al. (2019), clinical voice assistants have been developed to record patient visit information into electronic health records, reducing the time and effort a clinician spends on documentation and increasing time with patients. An AI-based medical assistant Android app called IntelliDoctor was developed to predict future medical concerns based on users’ symptoms and medical history. Jensen et al. (2019) even suggested that this concept be extended to develop a comprehensive clinical assistant that can carry out initial medical screening in a bid to reduce unnecessary patient-doctor interactions during the global pandemic.
NLP technologies are now even able to predict potential viral mutations by generating protein sequences, which helps us understand how the coronavirus evades our immune system. In a recent paper published in Science, Bonnie Berger, a computational biologist at MIT, and her colleagues made a groundbreaking attempt to model coronavirus mutations using NLP. They modelled the viral mutation process using two classic linguistic concepts: syntax and semantics.
Syntax refers to the set of rules or principles that govern the structure of sentences in a language, that is, the arrangement of words and phrases into well-formed sentences. For example, any English speaker would be able to tell that the sentence “I ate an apple” is grammatically correct, whereas the sentence “I an apple ate” is not. This example illustrates the concept of word order, i.e., the accepted sequence of words and entities in a language. English, like many Indo-European languages, has a word order of SVO (Subject-Verb-Object). It follows that, by default, the verb of an English sentence must come between a subject and an object. While this might seem natural to native English speakers, speakers of other languages might find it strange and unnatural. Japanese, for instance, has a word order of SOV (Subject-Object-Verb). The very same sentence (Watashi wa ringo o tabemashita) would be translated word for word into English as “I an apple ate”. In fact, SOV languages account for over 40% of all recognized languages on Earth, more than SVO languages like English and French, which account for only approximately 35%.
That being said, how could Berger’s team model viral mutations by using syntax? The genetic or evolutionary fitness of a virus can be loosely interpreted as its ability to infect a host. The greater the genetic fitness, the higher the chance that the virus can successfully infect a host. This is analogous to grammatical correctness: a thriving, infectious virus is like a sentence with a correct syntactic structure, and so counts as grammatically correct; an unsuccessful, noninfectious virus does not.
Semantics is the branch of linguistics concerned with the study of meaning, truth, and implications. Consider the following sentence, familiar to all linguists: “Colorless green ideas sleep furiously”. This is a famous example composed by Noam Chomsky in his 1957 book Syntactic Structures. All English speakers can tell you that this sentence is grammatically correct, just like “I ate an apple.” Yet all of them would also tell you that this sentence is complete nonsense. It is semantically meaningless, as ideas don’t have colors, let alone the ability to sleep furiously. This example shows that a perfectly grammatical sentence can be semantically pointless. In other words, for a sentence to be meaningfully grammatical, it must respect both syntactic and semantic rules.
The same applies to viruses. Mutations induce changes that alter the appearance of viruses, such as changing the structure of their surface proteins or deleting parts of their genetic sequence. In this analogy, viruses that have undergone mutations are said to have “changed their meaning”. The rapid mutation of the coronavirus is becoming a global concern as it is accompanied by an alarming increase in the infection rate and a skyrocketing need for medical resources. This is because certain mutations allow the virus to avoid being detected by the antibodies of our immune system, a process known as viral immune escape. The ability to predict and identify these mutation patterns has therefore become more crucial than ever. NLP models encode words in a mathematical space such that words with similar meanings are closer together than those with vastly different meanings. This is known as “embedding”. In the context of viruses, embedding their genetic sequences groups them according to how similar their mutations are. The high-level goal of Berger’s study is to locate mutations that might let a virus escape our immune system without compromising its ability to infect hosts: mutations that alter a virus’ meaning whilst the virus remains grammatically correct.
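To make the idea of embedding a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the three-number vectors, the mutant names, the grammaticality values and the way the two scores are multiplied together are all invented, and none of it is taken from Berger's paper. It only shows how "closeness of meaning" can be measured with cosine similarity, and how one might look for mutants whose meaning has changed a lot while they stay "grammatical".

```python
import math

# Toy 3-dimensional word embeddings, invented for illustration only.
# Real models learn vectors with hundreds of dimensions from data.
WORDS = {
    "apple":   [0.9, 0.1, 0.0],
    "pear":    [0.8, 0.2, 0.1],
    "furious": [0.0, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_change(original, mutant):
    """Distance in embedding space: the larger, the bigger the change in 'meaning'."""
    return 1.0 - cosine_similarity(original, mutant)

def rank_escape_candidates(original, mutants):
    """Rank mutants by semantic change weighted by grammaticality.

    `mutants` maps a name to (embedding, grammaticality in [0, 1]).
    Multiplying the two scores is a simplifying assumption of this sketch.
    """
    scores = {
        name: semantic_change(original, emb) * grammaticality
        for name, (emb, grammaticality) in mutants.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical wild-type virus and three hypothetical mutants.
wild_type = [1.0, 0.0, 0.0]
mutants = {
    "mutant_a": ([0.95, 0.05, 0.0], 0.90),  # barely changed, still fit
    "mutant_b": ([0.2, 0.9, 0.1], 0.80),    # very different, still fit
    "mutant_c": ([0.0, 0.1, 0.95], 0.05),   # very different, no longer fit
}

ranking = rank_escape_candidates(wild_type, mutants)
print(ranking)
```

In this toy setting, "apple" sits closer to "pear" than to "furious", and the mutant that ranks first is the one that has changed its meaning substantially while remaining fit, which is the kind of candidate the paragraph above describes.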