Do androids taste electric wine?
Professor Barry C Smith explores the philosophical implications and practical possibilities, for good and ill, of the increasing use of AI in the wine world.
Scarcely a week goes by without us hearing of an advance in artificial intelligence or a new application for it, whether it be medical breakthroughs in interpreting X-rays, DeepMind’s protein-folding model AlphaFold leading to drug discoveries, or the astonishing, humanlike performance of large language models (LLMs) like OpenAI’s ChatGPT or Google’s Gemini. It is all the more remarkable to realize that LLMs can generate screeds of relevant and articulate prose when all they have been trained to do is predict the next word or token in a sequence, albeit in a highly constrained algorithmic way. Perhaps auto-complete is cleverer than we thought, or maybe our ability to string words together into sentences is just another form of auto-complete.
As we navigate the technological landscape of the modern era, one field that seems to remain untouched by the AI revolution is wine tasting. However, it’s uncertain if this will always be the case. The ever-improving capabilities of AI technologies make it difficult to dismiss the possibility that they could significantly alter the wine industry. Despite the excitement this technology stirs, it’s prudent to analyze both the current situation and what the future may hold.
Examining the current landscape entails understanding precisely what AI means and differentiating it from general digital technologies. Not every digital application or algorithm qualifies as AI. Some are merely data aggregators without the capability to forge new insights or enhance decision-making. True AI systems, on the other hand, go beyond the vast datasets they are trained on. They detect patterns through machine-learning algorithms, which allow the AI to adapt to new scenarios. Such capabilities enable AI to model complex phenomena, like analyzing the architecture of folk songs, predicting recidivism rates, or assessing someone’s credit score. Moreover, AI’s applications extend beyond human-related phenomena; it also underpins large statistical models that assist in tracking disease progression or refining weather predictions, technologies that indirectly aid viticulturalists by improving agricultural practices, from satellite monitoring of fields and soil to the evaluation and selection of crops.
Yet a significant query persists: can AI ever fully replicate or surpass human roles in the wine industry, specifically in tasks requiring in-depth knowledge and sensory appraisal, such as tasting, discerning, and valuing wines? Might a sophisticated AI eventually outperform even the most skilled wine experts?
There’s reason to be doubtful when it comes to wine tasting. The act of tasting involves smell, taste, and touch, three senses absent from online interactions. Technology enthusiasts often argue that with our significant screen time, we could digitally record our entire lives for storage in the cloud. But can we truly capture every aspect? Remember that the scenes we enjoy in movies are composed solely of sounds and visuals, with no smell or taste to experience. More often than not, we’re smelling the popcorn of other moviegoers rather than the scents from the screen. Touch may sometimes be replicated through the vibration of a smartphone; however, the tangible aspects of actions like photography are hardly reproduced.
It’s revealing to consider how contributions from our chemical senses are overlooked as we envision converting all life experiences to digital form. This underscores the overlooked status of such senses in daily life and our screen-bound existence. Consequently, taking a break to savor a wine off-screen, appreciating its aromatic complexity, smooth texture, and lingering taste, feels like a freeing act, a poignant reminder of our physical essence. These instances support what artist Olafur Eliasson describes as the counter-numbness movement.
Is this perspective overly bleak? Is the lack of physical senses in digital realms merely a temporary limitation? Techno-optimists and data evangelists think so, convinced that expanding datasets and enhancing algorithms will soon overcome these barriers. Philosopher David Chalmers argues in his book Reality+ that virtual reality (VR) will evolve to include these missing dimensions, because the physics mechanisms underlying VR are expected to improve, potentially allowing it to handle the other senses. Chalmers posits that, like physical reality, VR may eventually offer a complete and authentic context for living, accommodating the entire human experience.1 He even suggests that VR worlds indistinguishable from reality may be available within the next century.
While technology may progress substantially, is it feasible to replicate experiences such as wine tasting virtually? I remain skeptical. Tasting involves actual liquids interacting with real taste buds, and smelling necessitates real scents reaching actual noses. Implementing devices on VR headsets that emit wine aromas and release taste components might mirror some physical interactions, but these still depend on genuine sensory engagements. Instead of creating complicated simulations, why not simply hand someone a glass of wine? The limited adoption of avatars for work or social interaction in settings like the Metaverse highlights another shortfall—these virtual gatherings lack tangible elements like coffee or evening drinks, emphasizing the gap between digital and physical experiences.
It seems like an insurmountable task. We typically rely on our senses to interact with virtual objects, like bottles and glasses displayed on a screen. However, our senses of smell and taste present a unique challenge in the digital realm. For visuals, a computer-generated image can convince our eyes of the “reality” of a 3D table or any object. But what about scent? If virtual aroma were a thing, it would require computer-generated signals to trigger our sense of smell. Can a digital creation replicate the sweet scent of a rose effectively? It appears that certain experiences, particularly those involving smell and taste, remain profoundly resistant to digital replication.
Despite AI’s limitations in replicating the tactile experience of wine tasting, the conversation surrounding wine can be fully digitized. Wine writers and enthusiasts frequently translate their sensory experiences into text, which can be analyzed for insights. This text-based information could potentially be leveraged by machine learning to identify patterns in wine descriptions and generate new content about wines not included in initial datasets. Such an endeavor raises interesting questions: Can AI learn to mimic the expertise of seasoned wine connoisseurs? Can it provide us with genuine insights into the qualities of wine?
A parallel challenge exists in the perfume industry. According to AI researcher Alex Wiltschko, while computers have mastered the digital representation of sight and sound, the digital representation of scent remains elusive. Wiltschko leads Osmo, a startup focused on digitizing scents to transform how we capture, transmit, and recall them. A recent advance by Wiltschko and his team involves training a deep-learning model to map molecular structures to scents. Their method uses a graph neural network trained on perfume experts’ odor descriptions. According to their paper in Science, the model was able not only to replicate the scent labels provided by perfumers but also to predict the scents of newly synthesized molecules with a high degree of accuracy.2 This success demonstrates the burgeoning potential of AI in fields like olfaction, hinting at a future where aroma might be as digitizable as sight and sound.
The results are impressive, but there are important limitations when it comes to comparisons with wine. The deep-learning algorithm enables the model to generate odor-quality labels for odor molecules, but these are single molecules. Most odors we perceive are mixtures, and the perception of odor mixtures is notoriously hard to explain, both at the receptor level and at the cortical level of the olfactory bulb. Olfactory researchers Joel Mainland of the Monell Center and Richard Gerkin of Osmo are currently adapting the Principal Odor Map’s model to generate accurate odor descriptions for two or three odor combinations. At this point, however, it is worth remembering that wine aromas of any interest will contain more than 800 volatile odor compounds.
Importantly, the AI system that generates the Principal Odor Map is trained without ever sampling odors. No nose was involved in the building of this model. It produces a linguistic mapping from one representational format to another, learning to associate vector representations of molecular structures with linguistic descriptions produced by perfumers.
The trouble comes because researchers writing about digital olfaction have a tendency to elide the difference between odors and their labels. We are promised “a generalized map of structure-odor relationships,” though what we are really dealing with are computational proxies for these items. There is no sensing, which raises the question of whether an AI-generated structure-odor map can tell us how a molecule smells. It can if the model unlocks the relationship between a molecule’s structure and its odor and if the descriptions of its odor are a good guide to what we would smell.
What we find in the training set and the outputs are labels that practicing perfumers use to describe a molecule’s odor quality. Do such labels equip us to know how a molecule smells? If I tell you a given molecule smells creamy, floral, ethereal, and green, do you now know how it smells? I suspect not. Similar considerations apply to the tasting notes wine professionals use to convey how a wine tastes. As I argued in a previous issue of this title, a sip (or a sniff) is worth a thousand words.3
The intriguing question that launched the investigation was: When experts classify the aromas or flavors of molecules or wines, what specific elements are they detecting? Is it possible for a deep-learning model to shed light on what these experts perceive? If the model identifies critical features from its training data that perfumers or wine tasters utilize in their assessments, perhaps it could. However, whether AI employs the same cognitive processes as humans to understand these scents is still uncertain—we simply do not have enough evidence yet. The manner in which an AI arranges its internal layers to process input and establish an order in the scent world is not clear. Although an AI generates predictions by discerning statistical patterns among complex molecular representations and associated human odor classifications, its functioning remains enigmatic, much like other deep-learning models.
This uncertainty does not imply that we cannot determine the essential chemical compounds that create appealing aromas like undergrowth, truffle, licorice, mint, and berries in aged red Bordeaux wines. Researchers have found that the intensity of some of these aromas depends on the concentrations of compounds such as dimethyl sulfide, 2-furanmethanethiol, and 3-sulfanylhexanol, which points to the potential to uncover the chemical foundations of these wine characteristics. Analytical chemists and sensory scientists reached these insights by correlating the aromatic profiles assessed by a tasting panel with the wines’ chemical compositions, discovered through techniques like gas chromatography–mass spectrometry. There is hope that AI could be trained to connect wine aromas identified by panels with their chemical profiles, with the goal of extending this recognition to new wines beyond the training set. Whether this is achievable remains to be seen.
More optimistically, AI trained on chromatograms could potentially identify a wine’s estate and vintage. A joint study by Geneva neuroscientist Alexandre Pouget and Bordeaux chemist Stéphanie Marchand already shows promise. After analyzing chemical data from different Bordeaux vintages and estates, they successfully trained a model to match unknown samples to specific properties. This research also highlighted distinct chemical signatures that distinguish between wines from the Right and Left banks of the Gironde, as well as variations between vintages. This suggests that the intricate blend of a vineyard’s elements might indeed carry a tangible identity, detectable by both AI and skilled human tasters, though likely through different means. While the working logic of deep learning remains elusive, there is evidence that both can identify defining traits, each in its own way. Perhaps in the future, machine learning could even replace traditional blind tasting.
Today, the proliferation of wine apps marks a significant application of digital technology within the wine industry. Could this tech movement be transformative? Initially, apps like Vivino and CellarTracker began as tools for offloading cognitive tasks, helping users decide which wines to purchase. This isn’t a novel scenario; traditionally, consumers confused by diverse wine selections often turned to expert critics for advice. More and more, people are now relying on digital apps to aid their selections, veering away from sommelier or critic suggestions that may not align with their personal preferences.
Wine applications function as recommendation systems, similar to those utilized by companies like Amazon and Netflix, suggesting that if you enjoyed one product, you might also enjoy another. Such systems operate on a “wisdom of the crowd algorithm,” which leans on what most people or people similar to you favor, without questioning the reasons behind these preferences. The aspect of “people like you” is crucial as it elevates the model from a simplistic scenario, typically joked about: a machine-learning algorithm walks into a bar, the bartender asks, “What’ll it be?” and the algorithm responds, “What is everyone else having?” These algorithms presume that a sufficient amount of data exists within consumer purchase patterns and preferences to make trustworthy predictions on other liked items. Initially, they aggregate consumer ratings data, using it to create statistically averaged predictions. Wine apps like Vivino provide overall wine ratings from users who scan labels and offer personal reviews; similarly, CellarTracker compiles thousands of reviews from its community, presenting scores, prices, and wine availability.
The advantages claimed for these systems include: (i) algorithms amalgamating multiple opinions that are likely to perform better than relying solely on a few esteemed wine critics; and (ii) the subjective preferences being adequate for crowd-based judgments.
Regarding point (i), it’s often mentioned that wine critics’ opinions may differ from general consumers because their tastes or preferences diverge, leading them to not prefer the same wines as the average consumer. A common notion is that “customers with less wine knowledge often prefer less expensive wines,” which suggests that wine sellers should pay closer attention to these consumers who typically purchase less expensive options, rather than insisting on the superiority of pricier wines, which may not be as well-liked.
Concerning point (ii), the concept of the wisdom of the crowd is used to explain why a collective opinion, or an aggregate response to a quantitative query, is generally more accurate than individual responses. As noted by Francis Galton, when a group is asked to estimate something like the height of the Eiffel Tower or a bull’s weight at a fair, despite varied individual guesses, the averaged estimate often approximates the correct answer more closely than individual guesses. The averaging process smooths out discrepancies by reducing noise. However, if taste is subjective, so that all opinions are equally valid or invalid, it is unclear how an aggregated liking score corrects errors or reduces discrepancies. What does this collective decision more accurately reflect? If all opinions hold equal validity, why should the crowd’s opinion weigh more than an individual’s? It’s argued that aggregated scores, rather than reflecting the crowd’s wisdom, depict the crowd’s preferences, particularly emphasizing highly liked, frequently purchased wines. The scores show how much such wines are liked by others who share the individual’s tastes, so their judgments are likely to reflect what the individual will enjoy. This rationale supports the growing popularity of wine recommendation apps.
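The noise-reduction effect behind the wisdom of the crowd can be illustrated with a small simulation. This is only a sketch under stated assumptions: the guesses are invented, drawn from an unbiased bell curve around a true value, and the function name, crowd size, and spread are all hypothetical. It also shows why the argument needs a fact of the matter to aim at, which is exactly what a purely subjective view of taste denies.

```python
import random
import statistics

def crowd_vs_individual(true_value, n_guessers, noise_sd, seed=0):
    """Compare the error of the averaged guess with the typical individual error.

    Assumption: each guess is the true value plus unbiased Gaussian noise.
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_guessers)]
    # Error of the crowd's averaged estimate.
    crowd_error = abs(statistics.mean(guesses) - true_value)
    # Average error of a single guesser.
    individual_error = statistics.mean(abs(g - true_value) for g in guesses)
    return crowd_error, individual_error

# Illustrative numbers only: 800 guessers estimating a quantity whose true value is 330.
crowd_error, individual_error = crowd_vs_individual(330.0, 800, 40.0)
print(f"crowd error: {crowd_error:.1f}, typical individual error: {individual_error:.1f}")
```

The averaged guess can never be further from the truth than the average individual is, and with unbiased noise it is usually far closer. Remove the true value, as when every liking score counts as equally valid, and there is no longer any error for the averaging to reduce.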
Can the practice of averaging consumer preferences at specific price points truly aid an individual in expanding their wine knowledge, or is there a more nuanced method of exploration? More crucially, can these methods not only assist in researching known wines but also in uncovering new, fascinating choices?
Two primary methodologies exist for creating wine recommendation systems: collaborative filtering and content filtering. Collaborative filtering analyzes consumer data to identify patterns in preferences, purchases, and recommendations, using similarities between consumers to predict tastes based on shared preferences. With the advent of machine learning, this approach has evolved to include matrix factorization techniques that infer abstract characteristics of both users and wines, though it is not transparent how these characteristics influence taste predictions.
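To make matrix factorization less abstract, here is a minimal sketch in plain Python. It is not the method of any particular app: the ratings are invented, the function names and settings (two hidden factors, learning rate, regularization) are assumptions chosen for illustration, and the training loop is ordinary stochastic gradient descent.

```python
import random

# Hypothetical (user, wine, rating) triples on a 1-5 scale; most pairs are unrated.
RATINGS = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 0, 1), (2, 2, 5), (2, 3, 4),
    (3, 1, 2), (3, 2, 4), (3, 3, 5),
]

def predict_rating(U, V, user, wine):
    """Predicted rating = dot product of the user's and the wine's hidden factors."""
    return sum(u * v for u, v in zip(U[user], V[wine]))

def mse(U, V, ratings):
    """Mean squared error over the observed ratings."""
    return sum((r - predict_rating(U, V, u, w)) ** 2 for u, w, r in ratings) / len(ratings)

def factorize(ratings, n_users, n_wines, n_factors=2, epochs=2000,
              lr=0.02, reg=0.02, seed=0):
    """Learn hidden factors for users and wines by stochastic gradient descent."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_wines)]
    for _ in range(epochs):
        for user, wine, rating in ratings:
            err = rating - predict_rating(U, V, user, wine)
            for k in range(n_factors):
                u_k, v_k = U[user][k], V[wine][k]
                # Nudge both factor vectors to shrink the error, with mild regularization.
                U[user][k] += lr * (err * v_k - reg * u_k)
                V[wine][k] += lr * (err * u_k - reg * v_k)
    return U, V

U0, V0 = factorize(RATINGS, 4, 4, epochs=0)   # untrained starting point
U, V = factorize(RATINGS, 4, 4)               # trained factors
print("error before and after training:", mse(U0, V0, RATINGS), mse(U, V, RATINGS))
print("predicted rating of wine 3 by user 0:", predict_rating(U, V, 0, 3))
```

The learned numbers in U and V carry no labels such as "acidity" or "oak"; they are whatever dimensions best reproduce the ratings, which is why their influence on a prediction is hard to interpret.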
In contrast, content filtering focuses on the attributes of the wines themselves, such as type of grape, origin, and production methods like biodynamic processes. This method recommends wines that share similar features, enhanced by user ratings that determine the significance of these attributes for specific users. Essential to content-based filtering are the tasting notes provided by users, which serve as vital data for applying modern natural language processing technologies.
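A content filter of this kind can be sketched in a few lines. The wines, attribute flags, and ratings below are invented for illustration, and the weighting scheme (rating minus the user's average) is an assumption rather than a description of any real app.

```python
import math

FEATURES = ["sauvignon_blanc", "chardonnay", "loire", "marlborough", "burgundy", "biodynamic"]

# Hypothetical attribute vectors (1 = has the attribute), in the order of FEATURES.
WINES = {
    "sancerre":       [1, 0, 1, 0, 0, 1],
    "pouilly_fume":   [1, 0, 1, 0, 0, 0],
    "marlborough_sb": [1, 0, 0, 1, 0, 0],
    "meursault":      [0, 1, 0, 0, 1, 0],
    "chablis":        [0, 1, 0, 0, 1, 1],
}

def user_profile(ratings):
    """Weight each rated wine's attributes by how far its rating sits from the user's mean."""
    mean = sum(ratings.values()) / len(ratings)
    profile = [0.0] * len(FEATURES)
    for wine, rating in ratings.items():
        for i, flag in enumerate(WINES[wine]):
            profile[i] += (rating - mean) * flag
    return profile

def cosine(a, b):
    """Cosine similarity between two attribute vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(ratings):
    """Rank the wines the user has not rated by similarity to the user's profile."""
    profile = user_profile(ratings)
    scored = [(cosine(profile, attrs), wine)
              for wine, attrs in WINES.items() if wine not in ratings]
    return [wine for _, wine in sorted(scored, reverse=True)]

ranking = recommend({"sancerre": 5, "meursault": 2})
print(ranking)  # ['pouilly_fume', 'marlborough_sb', 'chablis']
```

Because the user liked a Loire Sauvignon Blanc and disliked a Burgundy Chardonnay, the other Loire Sauvignon Blanc ranks first. A fuller system would also turn tasting notes into attributes, which is where natural language processing comes in.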
The accuracy of predicting consumer preferences significantly increases when these two approaches are combined into hybrid recommender systems. By integrating predictions from both collaborative and content-based filters using a hybrid algorithm, the system can provide more precise recommendations.
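One simple way to combine the two filters is a weighted blend of their predictions. The sketch below assumes both filters already produce a score on the same scale; the weight, wine names, and scores are hypothetical, and real hybrid systems may combine the signals in more elaborate ways.

```python
def hybrid_score(collab, content, weight=0.6):
    """Blend two predictions; fall back on content alone when no crowd data exists."""
    if collab is None:
        return content
    return weight * collab + (1 - weight) * content

# Hypothetical predicted ratings on a 1-5 scale. wine_c is new, so no one has rated it yet.
collab_scores = {"wine_a": 4.2, "wine_b": 3.1, "wine_c": None}
content_scores = {"wine_a": 3.0, "wine_b": 4.5, "wine_c": 3.9}

blended = {wine: hybrid_score(collab_scores[wine], content_scores[wine])
           for wine in content_scores}
ranking = sorted(blended, key=blended.get, reverse=True)
print(ranking)  # ['wine_c', 'wine_a', 'wine_b']
```

The fallback shows one practical attraction of the hybrid approach: a wine that nobody has yet rated can still be recommended on the strength of its attributes.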
The effectiveness of algorithms on extensive datasets can be extraordinary, yet the question remains: how effectively do wine applications serve the learning and tasting aspirations of wine enthusiasts compared to professional wine evaluations? Also, what improvements are possible?
There are a variety of responses to this question. To paraphrase musician David Byrne’s comments on music recommendation services, these platforms often foster popularity over quality, a trend that might also apply to wine recommendation apps that guide users toward well-known and often more profitable selections.
During a noteworthy conference focused on judgment at Oxford’s Institute for Ethics in AI, I engaged over 100 non-experts in a wine tasting test. Participants sampled a French Sauvignon Blanc and compared it against two additional Sauvignon Blancs. When asked which wine they would recommend to someone who liked the initial wine, most chose the other French variety over a pricier New Zealand option recommended by the Vivino app. This highlights the app’s strategy to upgrade customers’ selections within a category but at a higher price, often driven by commercial partnerships with wine sellers. Moreover, whereas a wine critic might suggest less acidic Rhône whites or an exceptional Condrieu for those exploring different tastes, such autonomous recommendations by an app are improbable.
The gap between expert wine ratings and those from wine apps may not be as clear-cut as it appears. A study featured in Vox magazine in 2018 by Mark Schatzker and Richard Bazinet reviewed correlations for California wines, showing a Spearman correlation of 0.576 between CellarTracker and Robert Parker’s Wine Advocate, and a correlation of 0.424 between CellarTracker and Jancis Robinson’s website for another subset. As per Eric Levine, the founder of CellarTracker, the average user rates 49 wines, with 2,311 users rating over 500 wines, indicating a more informed crowd whose reviews more likely reflect wine quality than mere personal preference, especially for higher-priced wines. This suggests a meaningful convergence in opinions among more knowledgeable enthusiasts, much like professional wine tasters.
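For readers unfamiliar with the statistic, a Spearman correlation measures how closely two sets of scores agree on the ranking of the same wines, regardless of the scales used. The sketch below computes it from scratch; the five pairs of scores are invented and have nothing to do with the CellarTracker figures quoted above.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical scores for the same five wines from a crowd and from one critic.
crowd = [88, 91, 85, 93, 90]
critic = [90, 89, 86, 95, 92]
rho = spearman(crowd, critic)
print(round(rho, 3))  # 0.7
```

A value of 1 means the two sources rank the wines identically, 0 means no relationship, and -1 means the rankings are reversed.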
Wine applications utilize algorithms to analyze consumer behaviors and preferences to predict other wines they might enjoy. However, these algorithms do not reveal the reasons behind those preferences. This limitation results in a gap that can be filled by wine experts, including winemakers, wine merchants, sommeliers, and writers. These professionals might not always align with the personal tastes of the general public, but they possess a deep understanding of the characteristics that link the variety of wines a consumer enjoys, buys, or recommends.
Research conducted by Ophelia Deroy and computer scientist Pantelis Analytis explored whether the higher consistency and detailed discrimination of experts could surpass the collective wisdom of a larger, diverse group when incorporated into a system for collaborative-filtering recommendations. Bordeaux wines serve as an excellent subject, rated both by experts globally and by wine lovers. Because experts frequently participate in thorough wine tastings, their feedback creates richer datasets for data scientists to investigate. By contrast, data from casual wine lovers is sparser: the study included around 120 users who had each assessed a minimum of 50 wines. The detailed assessments of 13 well-known experts offered a much richer individual data pool.
When implementing a collaborative-filtering algorithm, the analysis first evaluated the taste preferences of wine experts against those of casual wine enthusiasts across approximately 2,000 Bordeaux wines. The findings highlighted noticeable differences between experts and regular app users; if you are not an expert, your taste in Bordeaux is usually more aligned with that of other enthusiasts rather than renowned critics like Jancis Robinson MW, Robert Parker, or Decanter reviewers. However, experts displayed greater consistency in their evaluations, showing a higher level of agreement among themselves compared to the aficionados.
When we merge ratings from both experts and consumers, what’s the impact? In a test, a system that sourced advice solely from experts outperformed one that used only consumer input when predicting preferences. The technique involved identifying the most similar experts and consumers, then consulting their opinions for predictions. Collaborative filtering suggests that a majority benefit more from recommendations influenced by a small group of critics rather than a broader base of over 100 consumers. Even a slight advantage here can have significant market effects.
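The technique of finding the most similar advisers and consulting their opinions can be sketched as a nearest-neighbor lookup. This is a generic illustration, not the study's actual procedure: the scores and names are invented, and the similarity measure (average gap on wines both have rated) and the choice of two advisers are assumptions.

```python
def similarity(a, b):
    """Negative mean absolute gap over wines both have rated (higher = more alike)."""
    shared = set(a) & set(b)
    if not shared:
        return None
    return -sum(abs(a[w] - b[w]) for w in shared) / len(shared)

def advise(target, advisers, wine, k=2):
    """Predict the target's score for a wine from the k most similar advisers who rated it."""
    candidates = []
    for name, ratings in advisers.items():
        if wine not in ratings:
            continue
        sim = similarity(target, ratings)
        if sim is not None:
            candidates.append((sim, name))
    nearest = sorted(candidates, reverse=True)[:k]
    if not nearest:
        return None
    return sum(advisers[name][wine] for _, name in nearest) / len(nearest)

# Hypothetical 100-point scores. The target has not tasted wine "w4".
target = {"w1": 90, "w2": 85, "w3": 92}
experts = {
    "critic_x": {"w1": 91, "w2": 86, "w3": 93, "w4": 94},
    "critic_y": {"w1": 88, "w2": 90, "w3": 87, "w4": 89},
    "critic_z": {"w1": 89, "w2": 84, "w3": 92, "w4": 92},
}
prediction = advise(target, experts, "w4")
print(prediction)  # 93.0
```

The same function works whether the pool of advisers is a handful of critics or a crowd of consumers, which makes it possible to compare how well each pool predicts what a given drinker will like.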
Wine applications like Vivino and CellarTracker aim to enhance their recommendation engines by encouraging more reviews from everyday users, integrating more complex collaborative-filtering and content-based methodologies, or combining both. Platforms like Preferabli, which use extensive data and content-filtering algorithms built on evaluations by Masters of Wine and Master Sommeliers to adjust to user preferences, show the potential of expert data. Real collaborative filtering might be achieved by effectively harmonizing the perspectives of both experts and enthusiasts.
Incorporating expert opinions with consumer feedback might provide a more accurate prediction of wine preferences than relying on either group alone. Instead of solely focusing on gathering numerous amateur ratings and advancing algorithms, another improvement could be to bring consumers and wine experts together on shared platforms. Even if personal tastes differ from those of experts, there’s a significant advantage in the experts’ consistency and extensive knowledge, which helps in predicting new wines that users might enjoy. The benefits of integrating an expert-enhanced dataset, both for combined quality assessments and for personalized recommendations, could be considerable.
Ultimately, leveraging a combination of expert and amateur ratings not only aims to refine prediction accuracy but also explores new ways to shape and enhance consumer tastes by introducing them to unique and varied wine selections. This could very well be a glimpse into the future of wine tasting and recommendations. Let’s look forward to it.
References
1. David J Chalmers, Reality+: Virtual Worlds and the Problems of Philosophy (Penguin, London; 2022), p.xvii.
2. Brian K Lee et al, “A Principal Odor Map Unifies Diverse Tasks in Olfactory Perception,” Science 381 (2023), pp.999–1006.
3. Barry C Smith, “Is a Sip Worth a Thousand Words?” WFW 21 (2008), pp.114–19.
Magali Picard, Cécile Thibon, Pascaline Redon, Philippe Darriet, Gilles de Revel, and Stéphanie Marchand explored the impact of Dimethyl Sulfide and various polyfunctional thiols on the olfactory character associated with aged red Bordeaux wines in their study published in the Journal of Agricultural and Food Chemistry in 2015.
Research by Michael Schartner, Jeff M Beck, Justine Laboyrie, Laurent Riquier, Stéphanie Marchand, and Alexandre Pouget, presented in 2023 in Communications Chemistry, discusses a method to identify the origin and vintage of Bordeaux red wines using raw gas chromatograms.
Steve Slatcher, “Subjectivity in Wine Appreciation,” WFW (2014).
Barry C Smith, “Getting More Out of Wine: Wine Experts, Wine Apps, and Sensory Science,” Current Opinion in Food Science (2019).
Pantelis P Analytis, Karthikeya Kaushik, Stefan Herzog, Bahador Bahrami, and Ophelia Deroy, “A Recommender-Network Perspective on the Informational Value of Critics and Crowds,” https://arxiv.org/html/2403.18868v1#bib (2024).