Difference between revisions of "Average" - New World Encyclopedia

From New World Encyclopedia
(article ready)
(imported latest version of article from Wikipedia)
Line 1: Line 1:
{{ready}}
+
{{Refimprove|date=September 2008}}
{{Unreferenced|date=October 2007}}
+
In [[mathematics]], an '''average''', or '''central tendency''' <ref>In [[statistics]], the term ''central tendency'' is used in some fields of [[empirical research]] to refer to what statisticians sometimes call "location".</ref> of a [[data set]] refers to a measure of the "middle" or "[[Expected value|expected]]" value of the data set. There are many different [[descriptive statistics]] that can be chosen as a measurement of the central tendency of the data items.
In [[mathematics]], an '''average''', or '''central tendency''' of a [[data set]] refers to a measure of the "middle" or "[[Expected value|expected]]" value of the data set. There are many different [[descriptive statistics]] that can be chosen as a measurement of the central tendency of the data items. The most common method is the [[arithmetic mean]], but there are many other types of averages.
 
  
Colloquially, people often use the term '''average''' to refer to an intuitive '''central tendency''' without having a specific measurement of central tendency in mind, or use terms such as "the average person." However, the phrase "there's no such thing as an average citizen" emphasizes that the average is a ''number'', not a person or some other object. The average is calculated by combining the measurements related to a group of people or objects, to compute a number as being the average of the group.
+
An average is a single value that is meant to typify a list of values. If all the numbers in the list are the same, then this number should be used. If the numbers are not all the same, an easy way to get a representative value from a list is to randomly pick any number from the list. However, the word 'average' is usually reserved for more sophisticated methods that are generally found to be more useful.
 +
 
 +
The most common method is the [[arithmetic mean]]. There are many other types of averages, such as the [[median]] (commonly used to describe house prices and incomes).<ref>An axiomatic approach to averages is provided by John Bibby (1974) “Axiomatisations of the average and a further generalization of monotonic sequences,” Glasgow Mathematical Journal, vol. 15, pp. 63–65.</ref> An average is calculated by combining the measurements in a set into a single number that represents the set.
  
 
__TOC__
 
__TOC__
Please see the [[table of mathematical symbols]] for explanations of the symbols used.  In [[statistics]], the term ''central tendency'' is used in some fields of [[empirical research]] to refer to what statisticians sometimes call "location." A "measure of central tendency" is either a [[location parameter]] or a [[statistic]] used to estimate a location parameter.  
+
==Calculation==
 +
===Arithmetic mean===
 +
{{main|Arithmetic mean}}
 +
Simply put, if <math>n</math> numbers are given, each number denoted by ''a<sub>i</sub>'', where <math>i=1, \dots ,n</math>, the arithmetic mean is the [[sum]] of the ''a<sub>i</sub>'s'' divided by <math>n</math> or
 +
:<math>AM=\frac{1}{n}\sum_{i=1}^na_i</math>.
 +
 
 +
The arithmetic mean, often simply called the mean, of two numbers, such as 2 and 8, is obtained by finding a value A such that 2 + 8 = A + A. It is then simple to find that A = (2 + 8)/2 = 5. Switching the order of 2 and 8 to read 8 and 2 does not change the resulting value obtained for A. The mean 5 is not less than the minimum 2 nor greater than the maximum 8. If we increase the number of terms in the list for which we want an average, we get, for example, that the arithmetic mean of 2, 8, and 11 is found by solving for the value of A in the equation 2 + 8 + 11 = A + A + A. It is simple to find that A = (2 + 8 + 11)/3 = 7.
 +
 
 +
Again, changing the order of the three members of the list does not change the result: A = (8 + 11 + 2)/3 = 7, and that 7 is between 2 and 11. This summation method is easily generalized for lists with any number of elements. However, the mean of a list of integers is not necessarily an integer. "The average family has 1.7 children" is a jarring way of making a statement that is more appropriately expressed by "the average number of children in the collection of families examined is 1.7".
 +
 
 +
===Geometric mean===
 +
{{main|Geometric mean}}
 +
The geometric mean of <math>a_1, a_2, \ldots, a_n</math> is defined as
 +
 
 +
<math>GM=\sqrt[n]{a_1 a_2 ... a_n}</math>
 +
 
 +
The geometric mean can be thought of as the [[antilog]] of the arithmetic mean of the [[logarithm|logs]] of the numbers.
  
==Calculating averages==
+
Example: the geometric mean of 2 and 8 is <math>GM = \sqrt{2 \cdot 8} = 4</math>.
An average is a representative value of a list. If all the numbers in the list were the same, then this number should be used. What if they are not the same? There are many different possible answers to this question. The average should not depend on the order of the numbers in the list, and it is often useful to also require that it should not be less than the smaller number in the list, nor greater than the greater number in the list (but see the annualiztion of returns for other than one year in duration).
 
  
An easy way to get a representative value from a list is to randomly pick any number from the list. However, the word 'average' is usually reserved for more sophisticated methods that are generally found to be more useful.
+
===Harmonic mean===
 +
{{main|Harmonic mean}}
 +
The harmonic mean of a set of numbers <math>a_1, a_2, \ldots, a_n</math> is defined as the reciprocal of the arithmetic mean of the reciprocals of the <math>a_i</math>:
  
The most common type of average is the [[arithmetic mean]], often simply called the mean. The arithmetic mean of two numbers, such as 2 and 8, is obtained by finding a value A such that 2 + 8 = A + A. It is then simple to find that A = (2 + 8)/2 = 5. Switching the order of 2 and 8 to read 8 and 2 does not change the resulting value obtained for A. The mean 5 is not less than the minimum 2 nor greater than the maximum 8. If we increase the number of terms in the list for which we want an average, we get, for example, that the arithmetic mean of 2, 8, and 11 is found by solving for the value of A in the equation 2 + 8 + 11 = A + A + A. It is simple to find that A = (2 + 8 + 11)/3 = 7. Again we see that changing the order of the three members of the list does not change the result: A = (8 + 11 + 2)/3 = 7, and that 7 is between 2 and 11. This summation method is easily generalized for lists with any number of elements. However, the mean of a list of integers is not necessarily an integer. "The average family has 1.7 children" is a jarring way of making a statement that is more appropriately expressed by "the average number of children in the collection of families examined is 1.7."
+
<math>HM = \frac{1}{\frac{1}{n}\sum_{i=1}^n \frac{1}{a_i}}=\frac{n}{\frac{1}{a_1}+\frac{1}{a_2}+\cdots+\frac{1}{a_n}}</math>
  
There are many other kinds of averages. However, they can all be understood in the same manner. For example, sometimes it is informative to consider the [[geometric mean]]. Here, instead of adding numbers we multiply them. Thus, the geometric mean of 2 and 8 is obtained by solving for G in the following equation: 2 * 8 = G * G. Thus, the geometric mean of 2 and 8 is G = sqrt(2 * 8) = 4. And again it is seen that changing the order of the members of the list to be averaged does not change the result: G = sqrt(8 * 2) = 4. In order to make sense of the requirement that the mean must be at least as big as the smallest member of the list and no bigger than the largest, the geometric mean is usually only applied to lists of positive numbers, not to lists that can include negative numbers such as temperatures.  
+
The harmonic mean is useful for calculating an average speed over equal distances. For example, if the speed going from point A to B was 60 km/h and the speed returning from B to A over the same route was 40 km/h, then the average speed for the round trip is <math>\frac{2}{1/60+1/40}=48</math> km/h.
  
It should now be obvious that it would be easy to come up with many other ways of combining the elements of a list in a manner that does not change when the order of the list is changed. For each of them one can define an average based on that method.
+
===Inequality Concerning AM, GM & HM===
 +
A well-known inequality concerning the arithmetic, geometric, and harmonic means of any set of positive numbers is
  
The most frequently occurring number in a list of numbers is called the [[mode (statistics)|mode]]. So the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. The mode is not necessarily well defined. The list (1, 2, 2, 3, 3, 5) has the two modes 2 and 3. The mode can be subsumed under the general method of defining averages by understanding it as taking the list and setting each member of the list equal to the most common value in the list if there is a most common value.  This list is then equated to a the resulting list with all values replaced by the same value.  Since they are already all the same, this does not require any change.
+
<math>AM \ge GM \ge HM</math>
  
Another average worth discussing is the [[median]]. Its method is to order the list according to its magnitude and then repeatedly remove the pair consisting of the highest and lowest value till either one or two values are left. If two values are left replace them with their arithmetic mean. This method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7, 13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are two elements in this list replace them by their arithmetic mean (3 + 7)/2 = 5. Now do the same for the equal sized list consisting of all the same value M: M, M, M, M. It is already ordered. We remove the two end values to get M, M. We take their arithmetic mean to get M. Finally, set this result equal to our previous result to get M = 5.
+
It is easy to remember by noting that the alphabetical order of the letters A, G and H is preserved in the inequality.
  
In finance people are often interested in the annualized return which is a different kind of average. To begin with an example consider two years in which the return in the first year is minus 10% and the return in the second year is plus 60%. Then the annualized return, R, would be obtained by solving the equation: (1 - 10%) * (1 + 60%) = (1 + R) * (1 + R). The value of R that makes this equation true is R = 12%. It is again to be noted that changing the order to find the annualized return of 60% and -10% gives the same result as the annualized return of -10% and 60%. This method can be generalized to examples where the periods are not all of one-year duration. Annualization of a set of returns is a variation on the geometric average that provides the intensive property of a return per year corresponding to a list of returns.  Consider a function that adds one to each return in the list and then takes the T th root of their product, where T is the sum of the periods of all the returns. This function is set equal to the same function for a list with the same number of elements composed of identical single year returns, whose value is the annualized returnFor example, consider a period of a half of a year for which the return is minus 20% and a period of two and one half years for which the return is 116%.  The annualized return for the combined period is the single year return, R, that is the solution of the following equation:
+
===Mode and median===
{(1-20%)*(1+116%)}^{1/(0.5 + 2.5)} = {(1+R)*(1+R)}^{1/(1 + 1)},
+
{{main|Mode (statistics)}}
giving an annualized return, R, of 20%. 
+
The most frequently occurring number in a list of numbers is called the mode. The mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. The mode is not necessarily well defined: the list (1, 2, 2, 3, 3, 5) has the two modes 2 and 3. The mode can be subsumed under the general method of defining averages by understanding it as taking the list and setting each member of the list equal to the most common value in the list, if there is a most common value. This list is then equated to the resulting list with all values replaced by the same value. Since they are already all the same, this does not require any change.
  
All averages can be thought of as examples of this general method for obtaining averages. A number of averages, including the ones discussed above, that have been found to be useful in some circumstance or other are listed below along with their formal solutions.
+
{{main|Median}}
 +
To find the median, order the list according to its elements' magnitude and then repeatedly remove the pair consisting of the highest and lowest values until either one or two values are left. If exactly one value is left, it is the median; if two values, the median is the arithmetic mean of these two. This method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7, 13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are two elements in this remaining list, the median is their arithmetic mean, (3 + 7)/2 = 5. Now do the same for the equal-sized list consisting of all the same value M: M, M, M, M. It is already ordered. We remove the two end values to get M, M. We take their arithmetic mean to get M. Finally, set this result equal to our previous result to get M = 5.
  
 +
===Annualized return===
 +
The annualized return is a type of average used in finance. For example, if there are two years in which the return in the first year is −10% and the return in the second year is +60%, then the annualized return, ''R'', can be obtained by solving the equation: {{nowrap|1= (1 − 10%) × (1 + 60%) = (1 − 0.1) × (1 + 0.6) = (1 + ''R'') × (1 + ''R'')}}. The value of ''R'' that makes this equation true is 0.2, or 20%. Note that changing the order to find the annualized returns of +60% and −10% gives the same result as the annualized returns of −10% and +60%.
 +
 +
This method can be generalized to examples in which the periods are not all of one-year duration. Annualization of a set of returns is a variation on the geometric average that provides the intensive property of a return per year corresponding to a list of returns. For example, consider a period of half a year during which the return runs at an annual rate of −23%, followed by a period of two and a half years during which it runs at an annual rate of +13%. The annualized return for the combined period is the single-year return, ''R'', that solves the following equation: {{nowrap|1= (1 − 0.23)<sup>0.5</sup> × (1 + 0.13)<sup>2.5</sup> = (1 + ''R'')<sup>0.5+2.5</sup>}}, giving an annualized return ''R'' of 0.0600 or 6.00%.
 +
 +
==Types==
 +
The [[table of mathematical symbols]] explains the symbols used below.
 
{|class="wikitable" style="background: white;"
 
{|class="wikitable" style="background: white;"
 
|-
 
|-
Line 45: Line 72:
 
| [[Harmonic mean]] || <math>\frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}</math>
 
| [[Harmonic mean]] || <math>\frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}</math>
 
|-
 
|-
| [[Quadratic mean]]<br/>(or RMS) || <math>\sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} =
+
| [[Quadratic mean]]<br>(or RMS) || <math>\sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} =
 
\sqrt {\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}
 
\sqrt {\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}
 
</math>
 
</math>
Line 61: Line 88:
 
| [[Winsorized mean]] ||  Similar to the truncated mean, but, rather than deleting the extreme values, they are set equal to the largest and smallest values that remain
 
| [[Winsorized mean]] ||  Similar to the truncated mean, but, rather than deleting the extreme values, they are set equal to the largest and smallest values that remain
 
|-
 
|-
| [[Annualization]] || -1 + {product (1+Rt)}^{1/sum(t))
+
| [[Compound annual growth rate|Annualization]] || <math>-1 + \left(\prod_i (1+R_i)\right)^{1/\sum_i t_i}</math>
 +
|}
 +
 
 +
==Solutions to variational problems==
 +
Several measures of central tendency can be characterized as solving a variational problem, in the sense of the [[calculus of variations]], namely minimizing variation from the center. That is, given a measure of [[statistical dispersion]], one asks for a measure of central tendency that minimizes variation: such that variation from the center is minimal among all choices of center. In a quip, "dispersion precedes location". In the sense of [[Lp space|<math>L^p</math> spaces]], the correspondence is:
 +
{| class="wikitable"
 +
! <math>L^p</math> !! dispersion !! central tendency
 +
|-
 +
! <math>L^1</math>
 +
| [[average absolute deviation]]
 +
| [[median]]
 +
|-
 +
! <math>L^2</math>
 +
| [[standard deviation]]
 +
| [[mean]]
 +
|-
 +
! <math>L^\infty</math>
 +
| [[maximum deviation]]
 +
| [[midrange]]
 
|}
 
|}
  
==Other averages==  
+
Thus standard deviation about the mean is lower than standard deviation about any other point; the uniqueness of this characterization of mean and midrange follows from [[convex optimization]], as the <math>L^2</math> and <math>L^\infty</math> norms are [[convex functions]]. Note that the median in this sense is not in general unique, and in fact any point between the two central points of a discrete distribution minimizes average absolute deviation.
 +
 
 +
Similarly, the [[Mode (statistics)|mode]] minimizes [[qualitative variation]].{{Fact|date=March 2008}}
 +
 
 +
==Miscellaneous types==  
  
 
Other more sophisticated averages are: [[trimean]], [[trimedian]], and [[normalized mean]]. These are usually more representative of the whole data set. {{Fact|date=February 2007}}
 
Other more sophisticated averages are: [[trimean]], [[trimedian]], and [[normalized mean]]. These are usually more representative of the whole data set. {{Fact|date=February 2007}}
Line 74: Line 123:
 
where ''f'' is any invertible function. The harmonic mean is an example of this using ''f''(''x'') = 1/''x'', and the geometric mean is another, using ''f''(''x'') = log&nbsp;''x''. Another example, expmean (exponential mean) is a mean using the function ''f''(''x'') = ''e''<sup>''x''</sup>, and it is inherently biased towards the higher values.  However, this method for generating means is not general enough to capture all averages.  A more general method for defining an average, y, takes any function of a list g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>), which is symmetric under permutation of the members of the list, and equates it to the same function with the value of the average replacing each member of the list: g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>) = g(y, y, ..., y).  This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself.  
 
where ''f'' is any invertible function. The harmonic mean is an example of this using ''f''(''x'') = 1/''x'', and the geometric mean is another, using ''f''(''x'') = log&nbsp;''x''. Another example, expmean (exponential mean) is a mean using the function ''f''(''x'') = ''e''<sup>''x''</sup>, and it is inherently biased towards the higher values.  However, this method for generating means is not general enough to capture all averages.  A more general method for defining an average, y, takes any function of a list g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>), which is symmetric under permutation of the members of the list, and equates it to the same function with the value of the average replacing each member of the list: g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>) = g(y, y, ..., y).  This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself.  
 
The function g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>) =x<sub>1</sub>+x<sub>2</sub>+ ...+ x<sub>n</sub> provides the arithmetic mean.  
 
The function g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>) =x<sub>1</sub>+x<sub>2</sub>+ ...+ x<sub>n</sub> provides the arithmetic mean.  
The function g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>) =x<sub>1</sub>&middot;x<sub>2</sub>&middot; ...&middot; x<sub>n</sub> provides the geometric mean.  
+
The function g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>) =x<sub>1</sub>·x<sub>2</sub>· ...· x<sub>n</sub> provides the geometric mean.  
The function g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>) =x<sub>1</sub><sup>&minus;1</sup>+x<sub>2</sub><sup>&minus;1</sup>+ ...+ x<sub>n</sub><sup>&minus;1</sup> provides the harmonic mean.
+
The function g(x<sub>1</sub>, x<sub>2</sub>, ..., x<sub>n</sub>) =x<sub>1</sub><sup>&minus;1</sup>+x<sub>2</sub><sup>&minus;1</sup>+ ...+ x<sub>n</sub><sup>&minus;1</sup> provides the harmonic mean. (See  John Bibby (1974) “Axiomatisations of the average and a further generalisation of monotonic sequences,” Glasgow Mathematical Journal, vol. 15, pp. 63–65.)
 
 
==Average applied to a data stream==
 
  
 +
==In data streams==
 
The concept of an average can be applied to a stream of data as well as a bounded set, the goal being to find a value about which recent data is in some way clustered.  The stream may be distributed in time, as in samples taken by some data acquisition system from which we want to remove noise, or in space, as in pixels in an image from which we want to extract some property.  An easy-to-understand and widely used application of average to a stream is the simple [[moving average]] in which we compute the arithmetic mean of the most recent N data items in the stream.  To advance one position in the stream, we add 1/N times the new data item and subtract 1/N times the data item N places back in the stream.
 
The concept of an average can be applied to a stream of data as well as a bounded set, the goal being to find a value about which recent data is in some way clustered.  The stream may be distributed in time, as in samples taken by some data acquisition system from which we want to remove noise, or in space, as in pixels in an image from which we want to extract some property.  An easy-to-understand and widely used application of average to a stream is the simple [[moving average]] in which we compute the arithmetic mean of the most recent N data items in the stream.  To advance one position in the stream, we add 1/N times the new data item and subtract 1/N times the data item N places back in the stream.
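The incremental update described above (add 1/N times the new item, subtract 1/N times the item N places back) can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function name `moving_averages` and the choice to emit values only once N items have arrived are assumptions made for the example.

```python
from collections import deque


def moving_averages(stream, n):
    """Yield the simple moving average of the most recent n items.

    Nothing is yielded until n items have been seen (an illustrative choice).
    """
    if n <= 0:
        raise ValueError("window size must be positive")
    window = deque()   # the most recent items, oldest on the left
    average = 0.0
    for item in stream:
        window.append(item)
        average += item / n                 # add 1/N times the new item
        if len(window) > n:
            average -= window.popleft() / n  # subtract 1/N times the item N places back
        if len(window) == n:
            yield average


print(list(moving_averages([2, 4, 6, 8, 10], 2)))  # [3.0, 5.0, 7.0, 9.0]
```

Over very long streams of floating-point data the running value can drift slightly from the exact mean of the window, so a practical version might recompute the sum from the window from time to time.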
  
==Derivation of the name==
+
==Etymology==
  
 
The original meaning of the word ''average'' is "damage sustained at sea": the same word is found in Arabic as ''awar'', in Italian as ''avaria'' and in French as ''avarie''.  Hence an ''average adjuster'' is a person who assesses an insurable loss.
 
The original meaning of the word ''average'' is "damage sustained at sea": the same word is found in Arabic as ''awar'', in Italian as ''avaria'' and in French as ''avarie''.  Hence an ''average adjuster'' is a person who assesses an insurable loss.
  
Marine damage is either ''particular average'', which is borne only by the owner of the damaged property, or [[general average]], where the owner can claim a proportional contribution from all the parties to the marine venture.  The type of calculations used in adjusting general average gave rise to the use of "average" to mean "arithmetic mean."
+
Marine damage is either ''particular average'', which is borne only by the owner of the damaged property, or [[general average]], where the owner can claim a proportional contribution from all the parties to the marine venture.  The type of calculations used in adjusting general average gave rise to the use of "average" to mean "arithmetic mean".
 +
 
 +
==Footnotes==
 +
<references />
 +
 
 +
==References==
 +
*{{citation|first1=G.H.|last1=Hardy|authorlink1=G.H. Hardy|first2=J.E.|last2=Littlewood|authorlink2=John Edensor Littlewood|first3=G.|last3=Pólya|authorlink3=George Pólya|title=Inequalities|year=1988|publisher=Cambridge University Press|edition=2nd|isbn=978-0521358804}}
 +
 
 +
==See also==
 +
* [[Statistics]]
  
 
==External links==
 
==External links==
 
{{Wiktionary}}
 
{{Wiktionary}}
*[http://economicsbulletin.vanderbilt.edu/2004/volume3/EB-04C10011A.pdf Median as a weighted arithmetic mean of all Sample Observations] - Retrieved December 13, 2007.
+
*[http://economicsbulletin.vanderbilt.edu/2004/volume3/EB-04C10011A.pdf Median as a weighted arithmetic mean of all Sample Observations]
*[http://www.sengpielaudio.com/calculator-geommean.htm Calculations and comparison between arithmetic and geometric mean of two values] - Retrieved December 13, 2007.
+
*[http://www.sengpielaudio.com/calculator-geommean.htm Calculations and comparison between arithmetic and geometric mean of two values]
 +
 
 +
[[Category:Summary statistics]]
 +
[[Category:Means]]
 +
[[Category:Statistical terminology]]
  
[[Category:Physical sciences]]
+
[[cs:Míra polohy]]
[[Category:Mathematics]]
+
[[de:Mittelwert]]
{{credits|177408396}}
+
[[es:Promedio]]
 +
[[eo:Centra dispozicio]]
 +
[[fr:Moyenne]]
 +
[[it:Media (statistica)]]
 +
[[nl:Gemiddelde]]
 +
[[ja:平均]]
 +
[[no:Gjennomsnitt]]
 +
[[pl:Średnia]]
 +
[[pt:Média]]
 +
[[sk:Priemer (štatistika)]]
 +
[[sl:Srednja vrednost]]
 +
[[fi:Keskiluku]]
 +
[[th:แนวโน้มสู่ส่วนกลาง]]
 +
[[tr:Ortalama]]
 +
[[wuu:平均]]

Revision as of 00:43, 9 November 2008

In mathematics, an average, or central tendency [1] of a data set refers to a measure of the "middle" or "expected" value of the data set. There are many different descriptive statistics that can be chosen as a measurement of the central tendency of the data items.

An average is a single value that is meant to typify a list of values. If all the numbers in the list are the same, then this number should be used. If the numbers are not all the same, an easy way to get a representative value from a list is to randomly pick any number from the list. However, the word 'average' is usually reserved for more sophisticated methods that are generally found to be more useful.

The most common method is the arithmetic mean. There are many other types of averages, such as the median (commonly used to describe house prices and incomes). [2] An average is calculated by combining the measurements in a set into a single number that represents the set.

Calculation

Arithmetic mean

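Simply put, if <math>n</math> numbers are given, each number denoted by <math>a_i</math>, where <math>i=1, \dots ,n</math>, the arithmetic mean is the sum of the <math>a_i</math> divided by <math>n</math>, or

<math>AM=\frac{1}{n}\sum_{i=1}^n a_i</math>.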

The arithmetic mean, often simply called the mean, of two numbers, such as 2 and 8, is obtained by finding a value A such that 2 + 8 = A + A. It is then simple to find that A = (2 + 8)/2 = 5. Switching the order of 2 and 8 to read 8 and 2 does not change the resulting value obtained for A. The mean 5 is not less than the minimum 2 nor greater than the maximum 8. If we increase the number of terms in the list for which we want an average, we get, for example, that the arithmetic mean of 2, 8, and 11 is found by solving for the value of A in the equation 2 + 8 + 11 = A + A + A. It is simple to find that A = (2 + 8 + 11)/3 = 7.

Again, changing the order of the three members of the list does not change the result: A = (8 + 11 + 2)/3 = 7, and that 7 is between 2 and 11. This summation method is easily generalized for lists with any number of elements. However, the mean of a list of integers is not necessarily an integer. "The average family has 1.7 children" is a jarring way of making a statement that is more appropriately expressed by "the average number of children in the collection of families examined is 1.7".
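The worked examples above can be reproduced with a few lines of Python. This is only a sketch: the function name `arithmetic_mean` and the list of family sizes are made up for illustration.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


print(arithmetic_mean([2, 8]))       # 5.0
print(arithmetic_mean([2, 8, 11]))   # 7.0
print(arithmetic_mean([8, 11, 2]))   # 7.0, the order does not matter

# Hypothetical counts of children in ten families: the mean of a list of
# integers need not be an integer.
children = [2, 1, 3, 0, 2, 2, 1, 3, 1, 2]
print(arithmetic_mean(children))     # 1.7
```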

Geometric mean

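The geometric mean of <math>a_1, a_2, \ldots, a_n</math> is defined as

<math>GM=\sqrt[n]{a_1 a_2 \cdots a_n}</math>

The geometric mean can be thought of as the antilog of the arithmetic mean of the logs of the numbers.

Example: the geometric mean of 2 and 8 is <math>GM = \sqrt{2 \cdot 8} = 4</math>.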
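The "antilog of the mean of the logs" description translates directly into code. The sketch below assumes positive inputs and uses an illustrative function name, `geometric_mean`; because it works in floating point, the result for 2 and 8 is 4 only up to rounding error.

```python
import math


def geometric_mean(values):
    """Exponentiate the arithmetic mean of the natural logs of the values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive numbers")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)


gm = geometric_mean([2, 8])
print(math.isclose(gm, 4.0))                       # True
print(math.isclose(gm, geometric_mean([8, 2])))    # True, order does not matter
```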

Harmonic mean

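The harmonic mean of a set of numbers <math>a_1, a_2, \ldots, a_n</math> is defined as the reciprocal of the arithmetic mean of the reciprocals of the <math>a_i</math>:

<math>HM = \frac{1}{\frac{1}{n}\sum_{i=1}^n \frac{1}{a_i}}=\frac{n}{\frac{1}{a_1}+\frac{1}{a_2}+\cdots+\frac{1}{a_n}}</math>

The harmonic mean is useful for calculating an average speed over equal distances. For example, if the speed going from point A to B was 60 km/h and the speed returning from B to A over the same route was 40 km/h, then the average speed for the round trip is <math>\frac{2}{1/60+1/40}=48</math> km/h.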
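A short Python sketch can confirm the speed example by computing the round trip both ways: once from total distance over total time, and once as a harmonic mean. The 120 km distance and the function names are assumptions chosen for illustration; any equal distance out and back gives the same answer.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive numbers")
    return len(values) / sum(1 / v for v in values)


def round_trip_speed(distance_km, speed_out, speed_back):
    """Average speed as total distance divided by total travel time."""
    total_time = distance_km / speed_out + distance_km / speed_back
    return 2 * distance_km / total_time


print(round_trip_speed(120, 60, 40))   # 48.0
print(harmonic_mean([60, 40]))         # approximately 48
```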

Inequality Concerning AM, GM & HM

A well-known inequality concerning the arithmetic, geometric, and harmonic means of any set of positive numbers is

<math>AM \ge GM \ge HM</math>

It is easy to remember by noting that the alphabetical order of the letters A, G and H is preserved in the inequality.
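The inequality can be checked numerically with the Python standard library's `statistics` module (`geometric_mean` requires Python 3.8 or later). The helper name `means_in_order` and the small tolerance, which absorbs floating-point rounding when the three means coincide, are assumptions of this sketch.

```python
import statistics


def means_in_order(values, tol=1e-9):
    """Return (AM, GM, HM, ok), where ok says whether AM >= GM >= HM holds."""
    am = statistics.mean(values)
    gm = statistics.geometric_mean(values)
    hm = statistics.harmonic_mean(values)
    ok = am >= gm - tol and gm >= hm - tol
    return am, gm, hm, ok


# For 2 and 8 the three means are 5, about 4, and 3.2.
print(means_in_order([2, 8]))
# When all values are equal, the three means coincide.
print(means_in_order([3, 3, 3]))
```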

Mode and median

The most frequently occurring number in a list of numbers is called the mode. The mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. The mode is not necessarily well defined: the list (1, 2, 2, 3, 3, 5) has the two modes 2 and 3. The mode can be subsumed under the general method of defining averages by understanding it as taking the list and setting each member of the list equal to the most common value in the list, if there is a most common value. This list is then equated to the resulting list with all values replaced by the same value. Since they are already all the same, this does not require any change.


To find the median, order the list according to its elements' magnitude and then repeatedly remove the pair consisting of the highest and lowest values until either one or two values are left. If exactly one value is left, it is the median; if two values, the median is the arithmetic mean of these two. This method takes the list 1, 7, 3, 13 and orders it to read 1, 3, 7, 13. Then the 1 and 13 are removed to obtain the list 3, 7. Since there are two elements in this remaining list, the median is their arithmetic mean, (3 + 7)/2 = 5. Now do the same for the equal-sized list consisting of all the same value M: M, M, M, M. It is already ordered. We remove the two end values to get M, M. We take their arithmetic mean to get M. Finally, set this result equal to our previous result to get M = 5.
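Both procedures above are short enough to sketch in Python. The median function follows the described method literally, sorting the list and then stripping the lowest and highest values until one or two remain; the function names `modes` and `median` are illustrative.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often (there may be more than one)."""
    counts = Counter(values)
    if not counts:
        raise ValueError("the mode of an empty list is undefined")
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


def median(values):
    """Sort, then repeatedly drop the lowest and highest value."""
    remaining = sorted(values)
    if not remaining:
        raise ValueError("the median of an empty list is undefined")
    while len(remaining) > 2:
        remaining = remaining[1:-1]   # remove one value from each end
    # One value left: that value. Two left: their arithmetic mean.
    return sum(remaining) / len(remaining)


print(modes([1, 2, 2, 3, 3, 3, 4]))   # [3]
print(modes([1, 2, 2, 3, 3, 5]))      # [2, 3]
print(median([1, 7, 3, 13]))          # 5.0
```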

Annualized return

The annualized return is a type of average used in finance. For example, if there are two years in which the return in the first year is −10% and the return in the second year is +60%, then the annualized return, R, can be obtained by solving the equation: (1 − 10%) × (1 + 60%) = (1 − 0.1) × (1 + 0.6) = (1 + R) × (1 + R). The value of R that makes this equation true is 0.2, or 20%. Note that changing the order to find the annualized returns of +60% and −10% gives the same result as the annualized returns of −10% and +60%.

This method can be generalized to examples in which the periods are not all of one-year duration. Annualization of a set of returns is a variation on the geometric average that provides the intensive property of a return per year corresponding to a list of returns. For example, consider a period of half a year during which the return runs at an annual rate of −23%, followed by a period of two and a half years during which it runs at an annual rate of +13%. The annualized return for the combined period is the single-year return, R, that solves the following equation: (1 − 0.23)<sup>0.5</sup> × (1 + 0.13)<sup>2.5</sup> = (1 + R)<sup>0.5+2.5</sup>, giving an annualized return R of 0.0600 or 6.00%.
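The two annualization examples can be reproduced with the sketch below. It assumes, as the second equation implies, that each return is expressed as a per-year rate held for the stated number of years; the function name `annualized_return` and the (rate, years) input format are choices made for this illustration.

```python
def annualized_return(rates_and_years):
    """Solve (1 + R) ** total_years = product of (1 + rate) ** years for R.

    rates_and_years is an iterable of (annual_rate, years) pairs,
    for example (-0.10, 1) for one year at minus 10%.
    """
    growth = 1.0
    total_years = 0.0
    for rate, years in rates_and_years:
        growth *= (1 + rate) ** years
        total_years += years
    if total_years <= 0:
        raise ValueError("the periods must cover a positive length of time")
    return growth ** (1 / total_years) - 1


# Two one-year periods at -10% and +60%: about 0.20.
print(round(annualized_return([(-0.10, 1), (0.60, 1)]), 4))
# Half a year at -23% and two and a half years at +13%: about 0.06.
print(round(annualized_return([(-0.23, 0.5), (0.13, 2.5)]), 4))
```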

Types

The table of mathematical symbols explains the symbols used below.

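{|class="wikitable" style="background: white;"
! Name !! Equation or description
|-
| Arithmetic mean || <math>\frac{1}{n} \sum_{i=1}^{n} x_i</math>
|-
| Median || The middle value that separates the higher half from the lower half of the data set
|-
| Geometric median || A rotation invariant extension of the median for points in R<sup>n</sup>
|-
| Mode || The most frequent value in the data set
|-
| Geometric mean || <math>\sqrt[n]{x_1 x_2 \cdots x_n}</math>
|-
| Harmonic mean || <math>\frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}</math>
|-
| Quadratic mean<br>(or RMS) || <math>\sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2} = \sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}</math>
|-
| Generalized mean || <math>\sqrt[p]{\frac{1}{n} \sum_{i=1}^{n} x_i^p}</math>
|-
| Weighted mean || <math>\frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}</math>
|-
| Truncated mean || The arithmetic mean of data values after a certain number or proportion of the highest and lowest data values have been discarded
|-
| Interquartile mean || A special case of the truncated mean, using the interquartile range
|-
| Midrange || <math>\frac{1}{2}\left(\max x + \min x\right)</math>
|-
| Winsorized mean || Similar to the truncated mean, but, rather than deleting the extreme values, they are set equal to the largest and smallest values that remain
|-
| Annualization || <math>-1 + \left(\prod_i (1+R_i)\right)^{1/\sum_i t_i}</math>
|}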
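The truncated and Winsorized means are described above only in words, so a small Python sketch may help. It assumes the same number `k` of values is treated as extreme at each end; the function names and the sample data are invented for the example.

```python
def truncated_mean(values, k):
    """Discard the k lowest and k highest values, then take the arithmetic mean."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, k):
    """Replace the k lowest and k highest values by the nearest remaining value."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    low, high = ordered[k], ordered[len(ordered) - k - 1]
    clipped = [min(max(v, low), high) for v in ordered]
    return sum(clipped) / len(clipped)


data = [1, 2, 4, 9, 100]          # one large outlier
print(sum(data) / len(data))      # 23.2, the plain arithmetic mean
print(truncated_mean(data, 1))    # 5.0, the mean of 2, 4, 9
print(winsorized_mean(data, 1))   # 5.2, the mean of 2, 2, 4, 9, 9
```

Both variants reduce the influence of the outlier 100, which dominates the plain arithmetic mean.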

Solutions to variational problems

Several measures of central tendency can be characterized as solving a variational problem, in the sense of the calculus of variations, namely minimizing variation from the center. That is, given a measure of statistical dispersion, one asks for a measure of central tendency that minimizes variation: such that variation from the center is minimal among all choices of center. In a quip, "dispersion precedes location". In the sense of L^p spaces, the correspondence is:

L^p | dispersion | central tendency
L^1 | average absolute deviation | median
L^2 | standard deviation | mean
L^∞ | maximum deviation | midrange

Thus standard deviation about the mean is lower than standard deviation about any other point; the uniqueness of this characterization of mean and midrange follows from convex optimization, as the L^2 and L^∞ norms are convex functions. Note that the median in this sense is not in general unique, and in fact any point between the two central points of a discrete distribution minimizes average absolute deviation.
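
The correspondence between dispersion measures and centers can be checked numerically. The following Python sketch is a brute-force illustration under stated assumptions: the helper names, the sample data, and the grid of candidate centers (step 0.01) are all invented for the example, and a grid search only approximates a true minimization.

```python
from statistics import mean, median

def avg_abs_dev(data, c):
    """Average absolute deviation of data about center c."""
    return sum(abs(x - c) for x in data) / len(data)

def rms_dev(data, c):
    """Root-mean-square deviation of data about center c."""
    return (sum((x - c) ** 2 for x in data) / len(data)) ** 0.5

def max_dev(data, c):
    """Maximum absolute deviation of data about center c."""
    return max(abs(x - c) for x in data)

def best_center(data, dispersion, candidates):
    """Candidate center with the smallest dispersion."""
    return min(candidates, key=lambda c: dispersion(data, c))

data = [1, 2, 4, 8, 20]                      # odd length, so the median is unique
grid = [i / 100 for i in range(0, 2001)]     # candidate centers 0.00 .. 20.00

abs_center = best_center(data, avg_abs_dev, grid)
rms_center = best_center(data, rms_dev, grid)
max_center = best_center(data, max_dev, grid)

print(abs_center, median(data))                   # minimizer vs. median
print(rms_center, mean(data))                     # minimizer vs. mean
print(max_center, (min(data) + max(data)) / 2)    # minimizer vs. midrange
```

With an even number of data points, the first search would return only one of the many points between the two central values that minimize the average absolute deviation.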

Similarly, the mode minimizes qualitative variation.[citation needed]

Miscellaneous types

Other, more sophisticated averages include the trimean, the trimedian, and the normalized mean. These are usually more representative of the whole data set.[citation needed]

One can create one's own average metric using the generalized f-mean:

y = f^(−1)((f(x1) + f(x2) + ... + f(xn)) / n)

where f is any invertible function. The harmonic mean is an example of this using f(x) = 1/x, and the geometric mean is another, using f(x) = log x. Another example, expmean (exponential mean), is a mean using the function f(x) = e^x, and it is inherently biased towards the higher values.

However, this method for generating means is not general enough to capture all averages. A more general method for defining an average, y, takes any function of a list g(x1, x2, ..., xn), which is symmetric under permutation of the members of the list, and equates it to the same function with the value of the average replacing each member of the list: g(x1, x2, ..., xn) = g(y, y, ..., y). This most general definition still captures the important property of all averages that the average of a list of identical elements is that element itself. The function g(x1, x2, ..., xn) = x1 + x2 + ... + xn provides the arithmetic mean. The function g(x1, x2, ..., xn) = x1 · x2 · ... · xn provides the geometric mean. The function g(x1, x2, ..., xn) = x1^(−1) + x2^(−1) + ... + xn^(−1) provides the harmonic mean. (See John Bibby (1974) “Axiomatisations of the average and a further generalisation of monotonic sequences,” Glasgow Mathematical Journal, vol. 15, pp. 63–65.)
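
As a minimal sketch of the generalized f-mean, the Python below applies f to each item, takes the arithmetic mean, and maps the result back through the inverse of f. The function name `f_mean` and the requirement that the caller supply the inverse explicitly are assumptions of this example.

```python
import math

def f_mean(data, f, f_inverse):
    """Generalized f-mean: f_inverse of the arithmetic mean of f(x)."""
    return f_inverse(sum(f(x) for x in data) / len(data))

data = [1.0, 2.0, 4.0]

arithmetic = f_mean(data, lambda x: x, lambda y: y)          # f(x) = x
geometric = f_mean(data, math.log, math.exp)                 # f(x) = log x
harmonic = f_mean(data, lambda x: 1 / x, lambda y: 1 / y)    # f(x) = 1/x
exp_mean = f_mean(data, math.exp, math.log)                  # f(x) = e^x

print(harmonic, geometric, arithmetic, exp_mean)
```

For this data the exponential mean comes out above the arithmetic mean, which illustrates its bias towards the higher values, and a list of identical elements returns that element (up to rounding) for every choice of f.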

In data streams

The concept of an average can be applied to a stream of data as well as a bounded set, the goal being to find a value about which recent data is in some way clustered. The stream may be distributed in time, as in samples taken by some data acquisition system from which we want to remove noise, or in space, as in pixels in an image from which we want to extract some property. An easy-to-understand and widely used application of average to a stream is the simple moving average in which we compute the arithmetic mean of the most recent N data items in the stream. To advance one position in the stream, we add 1/N times the new data item and subtract 1/N times the data item N places back in the stream.
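
The update rule for the simple moving average can be sketched in Python as follows. The class name, the use of a `deque` as the buffer, and the warm-up behavior (averaging whatever has arrived while fewer than N items have been seen) are assumptions of this example; the text does not specify how the first N − 1 items are handled.

```python
from collections import deque

class SimpleMovingAverage:
    """Arithmetic mean of the most recent `window` items of a stream."""

    def __init__(self, window):
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        self.items = deque()
        self.average = 0.0

    def update(self, value):
        """Consume one data item and return the current moving average."""
        self.items.append(value)
        if len(self.items) <= self.window:
            # Warm-up (assumed behavior): average the items seen so far
            self.average = sum(self.items) / len(self.items)
        else:
            # Add 1/N times the new item, subtract 1/N times the item N places back
            oldest = self.items.popleft()
            self.average += value / self.window - oldest / self.window
        return self.average

sma = SimpleMovingAverage(window=3)
outputs = [sma.update(x) for x in [1, 2, 3, 4, 5]]
print(outputs)
```

The incremental update costs constant time per item, but with floating-point data small rounding errors can accumulate over a very long stream, so an implementation might occasionally recompute the mean from the buffer.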

Etymology

The original meaning of the word average is "damage sustained at sea": the same word is found in Arabic as awar, in Italian as avaria and in French as avarie. Hence an average adjuster is a person who assesses an insurable loss.

Marine damage is either particular average, which is borne only by the owner of the damaged property, or general average, where the owner can claim a proportional contribution from all the parties to the marine venture. The type of calculations used in adjusting general average gave rise to the use of "average" to mean "arithmetic mean".

Footnotes

  1. In statistics, the term central tendency is used in some fields of empirical research to refer to what statisticians sometimes call "location".
  2. An axiomatic approach to averages is provided by John Bibby (1974) “Axiomatisations of the average and a further generalisation of monotonic sequences,” Glasgow Mathematical Journal, vol. 15, pp. 63–65.

References

  • Hardy, G.H.; J.E. Littlewood & G. Pólya (1988), Inequalities (2nd ed.), Cambridge University Press, ISBN 978-0521358804 
