Every year Robert Hodgson selects the finest wines from his small California winery and puts them into competitions around the state.
And in most years, the results are surprisingly inconsistent: some whites rated as gold medallists in one contest do badly in another. Reds adored by some panels are dismissed by others. Over the decades Hodgson, a softly spoken retired oceanographer, became curious. Judging wines is by its nature subjective, but the awards appeared to be handed out at random.
So drawing on his background in statistics, Hodgson approached the organisers of the California State Fair wine competition, the oldest contest of its kind in North America, and proposed an experiment for their annual June tasting sessions.
Each panel of four judges would be presented with their usual "flight" of samples to sniff, sip and slurp. But some wines would be presented to the panel three times, poured from the same bottle each time. The results would be compiled and analysed to see whether wine tasting really is scientific.
The first experiment took place in 2005. The last was in Sacramento earlier this month. Hodgson's findings have stunned the wine industry. Over the years he has shown again and again that even trained, professional palates are terrible at judging wine.
"The results are disturbing," says Hodgson from the Fieldbrook Winery in Humboldt County, described by its owner as a rural paradise. "Only about 10% of judges are consistent and those judges who were consistent one year were ordinary the next year.
"Chance has a great deal to do with the awards that wines win."
These judges are not amateurs either. They read like a who's who of the American wine industry, from winemakers, sommeliers, critics and buyers to wine consultants and academics. In Hodgson's tests, judges rated wines on a scale running from 50 to 100. In practice, most wines scored in the 70s, 80s and low 90s.
Results from the first four years of the experiment, published in the Journal of Wine Economics, showed a typical judge's scores varied by plus or minus four points over the three blind tastings. A wine deemed to be a good 90 would be rated as an acceptable 86 by the same judge minutes later and then an excellent 94.
Some of the judges were far worse, others better – with around one in 10 varying their scores by just plus or minus two. A few points may not sound much but it is enough to swing a contest – and gold medals are worth a significant amount in extra sales for wineries.
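To make the arithmetic concrete, here is a minimal Python sketch, not Hodgson's own method. It treats "plus or minus" as half the range of a judge's repeated scores, which is one simple reading of the published figure, and it uses a hypothetical gold-medal cut-off of 90 points. Neither the cut-off nor the function names come from the study.

```python
def score_spread(scores):
    """Half the range of repeated scores for one wine (the 'plus or minus')."""
    return (max(scores) - min(scores)) / 2


def verdict_flips(scores, cutoff):
    """True if the same wine lands on both sides of the cut-off."""
    return min(scores) < cutoff <= max(scores)


# The article's example: one judge, one bottle, three pours.
same_wine = [90, 86, 94]

# Hypothetical gold-medal threshold, chosen only for illustration.
GOLD_CUTOFF = 90

print(score_spread(same_wine))
print(verdict_flips(same_wine, GOLD_CUTOFF))
```

With these numbers the spread is four points, and the wine would be a gold on two pours but not on the third, which is how a few points can swing a contest.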
Hodgson went on to analyse the results of wine competitions across California, and found that their medals were distributed at random.
"I think there are individual expert tasters with exceptional abilities sitting alone who have a good sense, but when you sit 100 wines in front of them the task is beyond human ability," he says. "We have won our fair share of gold medals but now I have to say we were lucky."
His studies have irritated many figures in the industry. "They say I'm full of bullshit but that's OK. I'm proud of what I do. It's part of my academic background to find the truth."
Hodgson isn't alone in questioning the science of wine-tasting. French academic Frédéric Brochet tested the effect of labels in 2001. He presented the same Bordeaux superior wine to 57 volunteers a week apart and in two different bottles – one for a table wine, the other for a grand cru.
The tasters were fooled.
When tasting a supposedly superior wine, their language was more positive – describing it as complex, balanced, long and woody. When the same wine was presented as plonk, the critics were more likely to use negatives such as weak, light and flat.
In 2008 a study of 6,000 blind tastings by Robin Goldstein in the Journal of Wine Economics found a positive link between the price of wine and the amount people enjoyed it. But the link only existed for people trained to detect the elements of wine that make it expensive.
In 2011 Professor Richard Wiseman, a psychologist (and former professional magician) at Hertfordshire University, invited 578 people to comment on a range of red and white wines, varying from £3.49 for a claret to £30 for champagne, all tasted blind.
People could tell the difference between wines under £5 and those above £10 only 53% of the time for whites and only 47% of the time for reds. Overall they would have been just as successful flipping a coin to guess.
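One way to check the coin-flip comparison is an exact binomial test against a 50% success rate. The sketch below is illustrative Python, not Wiseman's analysis. The article does not say how many guesses lay behind each percentage, so it assumes one guess per participant (578 trials), which turns 53% and 47% into roughly 306 and 272 correct answers. The function name is made up for the example.

```python
from math import comb


def two_sided_p_fair_coin(successes, trials):
    """Exact two-sided binomial p-value against a fair coin (p = 0.5)."""
    # Count the outcomes at least as extreme as the observed one in each tail.
    upper = sum(comb(trials, i) for i in range(successes, trials + 1))
    lower = sum(comb(trials, i) for i in range(0, successes + 1))
    # Every sequence of guesses is equally likely under a fair coin.
    tail = min(upper, lower) / 2 ** trials
    return min(1.0, 2 * tail)


# ASSUMPTION: one guess per participant; the real trial counts are not given.
TRIALS = 578
whites_correct = round(0.53 * TRIALS)  # 306
reds_correct = round(0.47 * TRIALS)    # 272

p_whites = two_sided_p_fair_coin(whites_correct, TRIALS)
p_reds = two_sided_p_fair_coin(reds_correct, TRIALS)

print(f"whites: p = {p_whites:.3f}")
print(f"reds:   p = {p_reds:.3f}")
```

Under that assumption both p-values come out well above the conventional 0.05 threshold, so neither result can be told apart from guessing.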
So why are ordinary drinkers and the experts so poor at tasting blind? Part of the answer lies in the sheer complexity of wine.
For a drink made by fermenting fruit juice, wine is a remarkably sophisticated chemical cocktail. Dr Bryce Rankine, an Australian wine scientist, identified 27 distinct organic acids in wine, 23 varieties of alcohol in addition to the common ethanol, more than 80 esters and aldehydes, 16 sugars, plus a long list of assorted vitamins and minerals that wouldn't look out of place on the ingredients list of a cereal pack. There are even harmless traces of lead and arsenic that come from the soil.
Three of wine's most basic qualities – sweetness, sourness and bitterness – are picked up by the tongue's taste buds. A good wine has the perfect balance of sweet from the sugar in grapes, sourness from the acids, particularly tartaric and malic acid, and bitterness from alcohol and polyphenols, including tannins.
Many wines are more acidic than lemon juice and are only palatable because that acidity is balanced by sweetness and bitterness. "It's the holy trinity of the palate – sugar, acid and alcohol," says Dr James Hutchinson, a wine expert at the Royal Society of Chemistry.
Professionals distinguish between the balance of these three basic elements and a wine's flavour. And here the chemistry gets more complicated.
The flavour of wine – its aroma or bouquet – is detected not by the taste buds, but by millions of receptors in the olfactory bulb, a blob of nervous tissue where the brain meets the nasal passage.
Chemists have identified at least 400 aroma compounds that work on their own and with others to create complex flavours – some appearing immediately on first sniffing, others emerging only as an aftertaste. Most of these are volatiles – aromatic compounds that tend to have a low boiling point and waft away from glasses and tongues towards the olfactory bulb.
Some of these, the primary volatiles, are present in the grape. Others, the secondaries, are generated by yeast activity during fermentation. The rest, the tertiary volatiles, are formed as wine matures in barrels or bottles.
Over the last few decades, wine scientists have begun to identify the compounds responsible for some of the distinctive aromas in wine.
The grassy, gooseberry quality of sauvignon blanc, for instance, comes from a class of chemicals called methoxypyrazines. These contain nitrogen and are byproducts of the metabolism of amino acids in the grape. Concentrations are higher in cooler climates, which is why New Zealand sauvignon blancs are often more herbaceous than Australian ones.
The flowery aroma of muscat and gewürztraminer comes from a class of alcohol compounds called monoterpenes. These include linalool – a substance also used in perfumes and insecticide – and geraniol, a pale yellow liquid that doubles up as an effective mosquito repellent and gives geranium its distinctive smell.
The spicy notes of chardonnay have been attributed to compounds called megastigmatrienones, also found in grapefruit juice.
"People underestimate how clever the olfactory system is at detecting aromas and our brain is at interpreting them," says Hutchinson.
"The olfactory system has the complexity in terms of its protein receptors to detect all the different aromas, but the brain response isn't always up to it. But I'm a believer that everyone has the same equipment and it comes down to learning how to interpret it." Within eight tastings, most people can learn to detect and name a reasonable range of aromas in wine, Hutchinson says.
Detecting and finding the right vocabulary may be within everyone's grasp. But when it comes to ranking wines, Hutchinson shares Robert Hodgson's concerns.
"There's a lot of nonsense and emperor's new clothes in the wine world," Hutchinson says. "I have had a number of wines costing hundreds of pounds that have disappointed me – and a number costing between £5 and £10 which have been absolutely surprising."
People struggle with assessing wine because the brain's interpretation of aroma and bouquet is based on far more than the chemicals found in the drink. Temperature plays a big part. Volatiles in wine are more active when wine is warmer. Serve a New World chardonnay too cold and you'll only taste the overpowering oak. Serve a red too warm and the heady boozy qualities will be overpowering.
Colour affects our perceptions too. In 2001 Frédéric Brochet of the University of Bordeaux asked 54 wine experts to test two glasses of wine – one red, one white. Using the typical language of tasters, the panel described the red as "jammy" and commented on its crushed red fruit.
The critics failed to spot that both wines were from the same bottle. The only difference was that one had been coloured red with a flavourless dye.
Other environmental factors play a role. A judge's palate is affected by what she or he had earlier, the time of day, their tiredness, their health – even the weather.
For Hutchinson and Hodgson the unpredictability means that human scores of wines are of limited value.
"It's very subjective and there's a lot of politics marring it," says Hutchinson. "People should use it as one indicator and not as an end-all. It would be a great sadness if people were only driven by what critics say."
So if people cannot be relied on to judge wine, how about machines?
"In terms of replicating what a human can do we are a long way off," Hutchinson says. "The one thing we can do well, though, is a lot of amazing analytical chemistry that allows us to detect a huge range of different compounds in a glass of wine.
"We can start to have an indication of how the acidity balances with the sweetness and different levels of flavour compounds.
"But the step we haven't got to is how that raw chemical information can be crunched together and converted into something that reflects someone's emotional response. That might be something we can never achieve."
Meanwhile the blind tasting contests go on. Robert Hodgson is determined to improve the quality of judging. He has developed a test that will determine whether a judge's assessment of a blind-tasted glass in a medal competition is better than chance. The research will be presented at a conference in Cape Town this year. But the early findings are not promising.
"So far I've yet to find someone who passes," he says.