Data Accumulation and Variability

  • Measured performance (yield) of any product is the result of the combined effects of the genetics of the product and the environment in which it is tested. 
  • Even the most superior products will not win at every plot every time. 
  • Greater quantities of yield data will likely give a clearer picture of a product’s actual yield potential.

When looking over plot data, one may question why the product with the best average yield across multiple locations does not have the highest yield at every location, why it does not win every plot, or why its average ranking across multiple locations changes as the harvest season progresses. The answer to all these questions is a combination of variability and probability. Measured performance (yield) of any product is the result of the combined effects of the genetics of the product and the environment in which it is tested. One must always keep in mind that many variables contribute to yield performance in a trial. Average yields can also change as more data is accumulated across locations. Greater quantities of yield data will likely give a clearer picture of a product’s actual yield potential.

Variability of Observations

 

Most genetic traits that contribute to yield are quantitative traits, which means they are controlled by multiple genes that each contribute a portion of the overall characteristic. Observations of any quantitative trait, such as the height of an individual or the yield of a plot or strip trial, follow a bell-shaped curve (Figure 1). This variation is due to the interaction of environment and genetics, as the environment can affect each of these genes independently and in different ways. Most observations cluster close to the mean, but some observations, usually 5% or less, appear to be very different from the mean. These values are not wrong or incorrect; they are part of the natural variation seen in any population. For example, consider height for men in the United States. If the average height is 70 inches, most men will be between 66 and 74 inches tall, but a few will be much taller and a few others will be much shorter. Yield measurements of any particular product will follow a similar pattern.

When comparing two products, yield observations for each will fall into a bell-shaped curve around their means, as illustrated in Figure 1. The means of product A and product B are different, but the two distributions overlap. A specific observation for product B may therefore be higher than one for product A, even though the overall mean of product A is higher. This may be a response to the specific environment, or it may just be due to chance. Some environments may favor one product over another, resulting in a higher plot yield for product B than for product A, even though product A is the better overall performer. Environmental factors such as excess moisture, drought, or disease may favor product B over product A, while in a different environment the opposite might be true, depending on each product’s response to the environment. Some differences may also be due to experimental error; for example, plant populations may differ slightly because of germination or planting differences. As with any statistical calculation, the more observations that are evaluated, the higher the confidence that the calculated mean represents the true mean of the population.
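To put rough numbers on that last point, the sketch below estimates how tightly a mean yield is pinned down as plots accumulate. It is an illustration only: it assumes independent plots whose yields are normally distributed with a known standard deviation, and the 5 bu/acre value and the function name `mean_margin_of_error` are hypothetical, not taken from any trial data.

```python
from math import sqrt
from statistics import NormalDist


def mean_margin_of_error(plot_sd, n_plots, confidence=0.95):
    """Half-width of a normal-theory confidence interval for a mean yield.

    Assumes independent plots that share a known standard deviation.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * plot_sd / sqrt(n_plots)


# Hypothetical plot-to-plot standard deviation in bu/acre (an assumption).
PLOT_SD = 5.0

for n in (1, 4, 11, 44, 100):
    margin = mean_margin_of_error(PLOT_SD, n)
    print(f"{n:>3} plots: mean known to within +/- {margin:.1f} bu/acre")
```

Under these assumptions the margin shrinks with the square root of the number of plots, so quadrupling the number of plots halves the uncertainty in the mean.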

Figure 1. Yield distribution and individual yields of eleven individual plots among two seed products.

Importance of Having All the Data

Because of the inherent variability of observations, the initial data that is reported may not give a clear picture of the actual average performance of a product. Figure 2 charts the percent of data accumulated against the rank correlation among entries in a yield plot. As harvest season begins and only limited field data has been collected, the correlation is low. As harvest season progresses, yield data accumulates and the correlation becomes stronger, moving closer to a value of 1. Towards the end of the harvest season, the correlation rises to over 0.9 (90%), as the large quantity of accumulated data gives a better estimate of the true yield potential of a product and its rank among the other field plot entries. Thus, with very little data accumulated, a product's ranking within one yield plot can differ widely from its ranking across other plots. As data is accumulated, a product's rank will likely become more consistent and provide a better estimate of its true yield potential.
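The pattern described for Figure 2 can be reproduced with a small simulation. The sketch below is not the analysis behind the figure: it assumes the correlation compares rankings from partial data with rankings from the complete data set, uses Spearman's rank correlation, and draws yields from normal distributions. The product count, location count, and standard deviations are all hypothetical.

```python
import random


def ranks(values):
    """Rank 1 = highest value. Assumes no exact ties (continuous yields)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def simulate(n_products=20, n_locations=100, product_sd=2.0, plot_sd=6.0, seed=1):
    """Correlate rankings from the first k locations with the final rankings."""
    rng = random.Random(seed)
    true_means = [60.0 + rng.gauss(0.0, product_sd) for _ in range(n_products)]
    # yields[location][product]: true mean plus location-level noise
    yields = [[rng.gauss(m, plot_sd) for m in true_means]
              for _ in range(n_locations)]

    def running_mean(k):
        return [sum(yields[loc][p] for loc in range(k)) / k
                for p in range(n_products)]

    final = running_mean(n_locations)
    correlations = {}
    for pct in (5, 10, 25, 50, 75, 100):
        k = max(1, n_locations * pct // 100)
        correlations[pct] = spearman(running_mean(k), final)
    return correlations


for pct, rho in simulate().items():
    print(f"{pct:>3}% of data: rank correlation with final ranking = {rho:.2f}")
```

Changing `plot_sd` relative to `product_sd` shows how noisier trials, or products that are closer in true yield, need more data before the rankings settle.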

The more data that is accumulated on a given product, the more stable its ranking among multiple products becomes. In the example in Figure 1, there are eleven individual yield data points for each product, and all values fall within the expected distribution around that product's true mean yield. However, within this limited data set, seven of the eleven individual yields of Product B are close to or greater than the true mean yield of Product A, while only four of the eleven individual yields of Product A are close to or greater than its own true mean. Although the actual mean of Product A will be greater than that of Product B once all the data is collected, as shown by the position of the dotted line, it would initially appear from this limited data set that Product B is the higher yielder.
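How often a limited data set points to the wrong leader can be estimated under simple assumptions. The sketch below assumes independent, normally distributed plot yields with the same standard deviation for both products. The 3 bu/acre advantage, the 5 bu/acre standard deviation, and the function name `prob_wrong_leader` are hypothetical and do not come from Figure 1.

```python
from math import sqrt
from statistics import NormalDist


def prob_wrong_leader(true_advantage, plot_sd, n_plots):
    """Chance the weaker product has the higher average yield.

    Each product is averaged over n_plots independent plots, so the
    difference between the two averages has standard deviation
    plot_sd * sqrt(2 / n_plots).
    """
    sd_of_mean_difference = plot_sd * sqrt(2.0 / n_plots)
    difference = NormalDist(mu=true_advantage, sigma=sd_of_mean_difference)
    # Probability that (stronger average - weaker average) falls below zero.
    return difference.cdf(0.0)


# Hypothetical values in bu/acre (assumptions, not trial results).
ADVANTAGE = 3.0
PLOT_SD = 5.0

for n in (1, 3, 11, 44):
    p = prob_wrong_leader(ADVANTAGE, PLOT_SD, n)
    print(f"{n:>2} plots per product: wrong leader about {p:.0%} of the time")
```

With these assumed values, the chance of a misleading average falls from roughly one in three with a single plot to well under one in ten with eleven plots per product.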

Figure 2. Correlation between yield rank and percent data accumulated.

Rolling the Test-Plot Dice

Probability is another factor in determining the winner in a test plot. The chances of a product winning depend on how many other products it is compared to in the trial and on the relative difference in their yields, that is, the superiority of one product over another. If two products have equal yield potential, the odds of either one winning are 50%, like the odds of heads or tails when you toss a coin. If you toss a coin 10 times, you will not necessarily get five heads and five tails. The same is true of a yield trial. If one product is superior, it will win more frequently, but a win is still not assured in every test. Figure 3 shows an example based on soybean data. Note that even if a product has a 4 bu/acre overall yield advantage, it will likely win only 75% of the head-to-head comparisons in a yield plot. Conversely, the product that actually yields 4 bu/acre less can win 25% of the time.
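The coin-toss and win-rate figures can be checked with a short calculation. The sketch below assumes that the yield difference in a single head-to-head comparison is normally distributed around the true advantage. That model and the function names are assumptions for illustration, not the method used to build Figure 3.

```python
from math import comb
from statistics import NormalDist


def prob_exact_heads(n_tosses, n_heads):
    """Probability of exactly n_heads heads in n_tosses fair coin tosses."""
    return comb(n_tosses, n_heads) / 2 ** n_tosses


def win_probability(yield_advantage, difference_sd):
    """P(a product wins one comparison), given its true advantage and the
    standard deviation of the observed head-to-head yield difference."""
    return NormalDist().cdf(yield_advantage / difference_sd)


def implied_difference_sd(yield_advantage, win_rate):
    """Standard deviation of the difference implied by an observed win rate.

    win_rate must not be exactly 0.5, where the advantage would be zero.
    """
    return yield_advantage / NormalDist().inv_cdf(win_rate)


print(f"P(exactly 5 heads in 10 tosses) = {prob_exact_heads(10, 5):.3f}")

# The article's example: a 4 bu/acre advantage wins about 75% of comparisons.
sd = implied_difference_sd(4.0, 0.75)
print(f"Implied SD of the head-to-head difference: {sd:.1f} bu/acre")

for advantage in (0.0, 2.0, 4.0, 8.0):
    print(f"{advantage:.0f} bu/acre advantage: wins {win_probability(advantage, sd):.0%}")
```

Ten tosses of a fair coin give exactly five heads only about a quarter of the time. Under the normal model, a 75% win rate at a 4 bu/acre advantage corresponds to a head-to-head difference that varies by about 5.9 bu/acre from plot to plot, which is larger than the advantage itself.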

Figure 3. Example of % wins probability vs. the actual yield difference.

Summary

No product, even if it is truly superior, will win every plot. Over many tests, industry-leading products have typical head-to-head winning percentages of only 60 to 65%. Environmental factors, genetic potential, and test variability all contribute to yield differences across test plot sites. Yield ranks among entries in compiled data sets can also change based on the number of tests and the geographical location of the plots. The more data and comparisons that are assimilated and examined, the better the picture of yield performance. This more robust picture can increase the degree of confidence one can place in picking a winning product.

 
