Meyer began to feel as if he knew the people personally: the ones who described smells in terms of tea and fruit, or meat and gasoline, or blue Powerade and lollipops. The way they described their senses felt so intimate, he said later, “you could almost see the type of person they are.” He was becoming convinced that people believe they are bad at describing smells simply because they are so often asked to do it in labs, sniffing single, isolated molecules (when the more familiar odor of coffee is a blend of many hundreds of them), cloistered away from the context of their real lives and the smells that actually mattered to them. Given the right opportunity, he said, “people become very, very verbal.”
For Meyer, an IBM researcher who specializes in using algorithms to analyze biological data, and who was one of the people who insisted that the G.C.C.R. surveys should include open text boxes, this was exciting news. For years, scientists studying smell have been working off just a few, deeply deficient data sets that link different chemicals and the way humans perceive them. There was, for example, a record created in the late 1960s by a single perfumer, who described thousands of smells, and study after study relied on a single “Atlas of Odor Character Profiles,” published in 1985. It drew on the observations of volunteers who had been asked to smell various single molecules and chemical mixtures, rating and naming them according to a supplied list of descriptors that many scientists felt was flawed and dated.
More recently, Meyer and many others had been using a new data set, painstakingly created by scientists at the Rockefeller University in New York and published in 2016. (I visited the lab in 2014, while Leslie Vosshall and her colleagues were building their data, and was surprised to find I could “smell” one of the vials, though it probably just triggered my trigeminal system. When I told Vosshall that it seemed minty, she replied: “Really? Most people say, ‘Dirty socks.’”) But while the new data set was a significant improvement — 55 people smelled 480 different molecules, rating them by intensity, pleasantness, familiarity and how well they matched a list of 20 descriptions, including “garlic,” “spices,” “flower,” “bakery,” “musky,” “urinous” and so on — it was still a sign of how limited the field was.
This was why Meyer, along with his colleague Guillermo Cecchi, pushed for those open text boxes in the G.C.C.R. survey. They were interested in the possibilities of natural language processing, a branch of machine learning that uses algorithms to analyze the patterns of human expression; Cecchi was already using the technology to predict the early onset of Alzheimer’s, when it is most treatable, by analyzing details of the way people speak. Many researchers had written about the possibilities of using artificial intelligence to finally make a predictive olfactory map, as well as to look at links between changes in olfaction and all the diseases to which those changes are connected, but sufficient data was never available.
Now Covid had provided researchers with a big, complicated data set linking olfactory experience and the progression of a specific disease. It wasn’t constrained by numerical rankings, monomolecules or a few proffered adjectives, but instead allowed people to speak freely about real smells, in the real world, in all their complex and subjective glory.
When Meyer and Cecchi’s colleague Raquel Norel finished analyzing the open-ended answers from English-speaking respondents, they found, with surprise and delight, that their textual analysis was just as predictive of a Covid diagnosis as people’s numerical ratings of smell losses. The algorithms worked because people with Covid used very different words to talk about smell than those without it; even those who hadn’t fully lost their olfaction still tended to describe their sensations in the same ways, repeating words like “metallic,” “decayed,” “chemical,” “acid,” “sour,” “burnt” and “urine.” It was an encouraging finding, a proof of concept that they couldn’t wait to explore in a lot more depth — first in the G.C.C.R. responses in other languages and then, in the future, in other data sets related to other diseases. Meyer got excited when he talked about it. “Anything where smell changes,” he told me. “Depression, schizophrenia, Alzheimer’s, Parkinson’s, neurodegeneration, cognitive and neuropsychiatric disease. The whole enchilada, as they say.”
I had a hard time imagining the olfactory “map” that scientists have dreamed of for so long. Would it, I asked Mainland, look something like a periodic table? He suggested I think, instead, of the maps that scientists have made of “color space,” which arrange colors to show their mathematical relationships and mixtures. “We didn’t know how useful color space was until people started inventing things like color television and Photoshop,” he explained, adding that the map itself isn’t the goal, but rather the ability to use it to understand why we smell what we do. After that, what will be really interesting are the applications we can’t yet imagine. “It’s hard to understand the utility of the map,” he said, “until you have the map.”