java.lang.Object
  org.apache.commons.math.stat.descriptive.AbstractUnivariateStatistic
    org.apache.commons.math.stat.descriptive.rank.Percentile
public class Percentile
Provides percentile computation.
There are several commonly used methods for estimating percentiles (a.k.a. quantiles) based on sample data. For large samples, the different methods agree closely, but when sample sizes are small, different methods will give significantly different results. The algorithm implemented here works as follows:
1. Let n be the length of the (sorted) array and 0 < p <= 100 be the desired percentile.
2. If n = 1 return the unique array element (regardless of the value of p); otherwise
3. Compute the estimated percentile position pos = p * (n + 1) / 100 and the difference, d, between pos and floor(pos) (i.e. the fractional part of pos). If pos >= n return the largest element in the array; otherwise
4. Let lower be the element in position floor(pos) in the array and let upper be the next element in the array. Return lower + d * (upper - lower).
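The estimation steps can be sketched in plain Java as follows. This is only an illustration, not the library's code: the class name PercentileSketch and the method estimate are hypothetical, a full sort is used for brevity, positions are read as 1-based (which matches the median example given in this description), and the handling of pos < 1 is an assumption, since the steps do not cover that case.

```java
import java.util.Arrays;

/** Illustrative sketch only; names are hypothetical, not part of the library. */
final class PercentileSketch {

    /** Estimates the p-th percentile of values by following the listed steps. */
    static double estimate(double[] values, double p) {
        if (values == null || !(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("values must be non-null and 0 < p <= 100");
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN;                  // empty input
        }
        if (n == 1) {
            return values[0];                   // unique element, whatever p is
        }
        double[] sorted = values.clone();       // leave the caller's array untouched
        Arrays.sort(sorted);                    // full sort for simplicity; NaN sorts last
        double pos = p * (n + 1) / 100;         // estimated (1-based) position
        double fpos = Math.floor(pos);
        double d = pos - fpos;                  // fractional part of pos
        if (pos < 1) {
            return sorted[0];                   // assumption: clamp to the smallest element
        }
        if (pos >= n) {
            return sorted[n - 1];               // largest element
        }
        double lower = sorted[(int) fpos - 1];  // element in 1-based position floor(pos)
        double upper = sorted[(int) fpos];      // next element
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        double[] data = {0, 1, 2, 3, 4, Double.NaN};
        System.out.println(estimate(data, 50)); // prints 2.5
    }
}
```

With n = 6 and p = 50 the position is 3.5, so the result interpolates halfway between the third and fourth sorted elements (2 and 3).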
To compute percentiles, the data must be at least partially ordered. Input
arrays are copied and recursively partitioned using an ordering definition.
The ordering used by Arrays.sort(double[]) is the one determined
by Double.compareTo(Double). This ordering makes
Double.NaN larger than any other value (including
Double.POSITIVE_INFINITY). Therefore, for example, the median
(50th percentile) of
{0, 1, 2, 3, 4, Double.NaN} evaluates to 2.5.
Since percentile estimation usually involves interpolation between array
elements, arrays containing NaN or infinite values will often
result in NaN or infinite values returned.
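The ordering described above can be checked with the JDK alone. The short sketch below (the class name NaNOrderingDemo is hypothetical) sorts a copy of an array containing NaN and an infinity, and shows how a NaN operand propagates through the interpolation formula.

```java
import java.util.Arrays;

/** Illustrative sketch only; the class name is hypothetical. */
final class NaNOrderingDemo {

    /** Returns a sorted copy, using the ordering of Arrays.sort(double[]). */
    static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        double[] sorted = sortedCopy(new double[] {Double.NaN, 3, Double.POSITIVE_INFINITY, 0});
        System.out.println(Arrays.toString(sorted));  // prints [0.0, 3.0, Infinity, NaN]

        // NaN compares as larger than positive infinity under this ordering.
        System.out.println(Double.compare(Double.NaN, Double.POSITIVE_INFINITY) > 0);  // prints true

        // Interpolating with a NaN upper bound yields NaN.
        double lower = 4;
        double upper = Double.NaN;
        System.out.println(lower + 0.5 * (upper - lower));  // prints NaN
    }
}
```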
Since 2.2, the Percentile implementation uses only selection instead of complete
sorting, and caches the selection algorithm state between calls to the various
evaluate methods when several percentiles are to be computed on the same data.
This greatly improves efficiency, both for single-percentile and
multiple-percentile computations. However, it also requires ensuring that the
data passed to one call to evaluate is the same as the data associated with the
cached algorithm state from the previous calls. By default, Percentile does this
by checking the array reference itself and a checksum of its content. Users who
already know that they call evaluate on an immutable array can save the checking
time by calling the evaluate methods that do not take an array argument.
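One way to picture the reference-plus-checksum check is the sketch below. It is not the library's code: the class CachedDataCheck and its method are hypothetical, and Arrays.hashCode stands in for the checksum, which this description does not specify.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and the choice of checksum are assumptions. */
final class CachedDataCheck {

    private double[] lastArray;   // array reference seen at the previous call
    private int lastChecksum;     // checksum of its content at that time

    /**
     * Returns true when the given array is the same object, with the same
     * content checksum, as the one seen at the previous call, so that state
     * cached for it could be reused.
     */
    boolean isSameData(double[] values) {
        int checksum = Arrays.hashCode(values);  // stand-in checksum (assumption)
        boolean same = values == lastArray && checksum == lastChecksum;
        lastArray = values;
        lastChecksum = checksum;
        return same;
    }

    public static void main(String[] args) {
        CachedDataCheck check = new CachedDataCheck();
        double[] data = {1, 2, 3};
        System.out.println(check.isSameData(data));  // prints false (nothing cached yet)
        System.out.println(check.isSameData(data));  // prints true (same reference, same content)
        data[0] = 9;                                 // modify the array in place
        System.out.println(check.isSameData(data));  // prints false (content changed)
    }
}
```

Computing a checksum costs a pass over the array on every call, which is the checking time the paragraph above refers to.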
Note that this implementation is not synchronized. If
multiple threads access an instance of this class concurrently, and at least
one of the threads modifies its state (for example by calling setData() or
setQuantile()), it must be synchronized externally.
| Field Summary | |
|---|---|
| private int[] | cachedPivots - Cached pivots. |
| private static int | MAX_CACHED_LEVELS - Maximum number of partitioning pivots cached (each level doubles the number of pivots). |
| private static int | MIN_SELECT_SIZE - Minimum size under which we use a simple insertion sort rather than Hoare's select. |
| private double | quantile - Determines what percentile is computed when evaluate() is called with no quantile argument. |
| private static long | serialVersionUID - Serializable version identifier. |
| Constructor Summary |
|---|
| Percentile() - Constructs a Percentile with a default quantile value of 50.0. |
| Percentile(double p) - Constructs a Percentile with the specific quantile value. |
| Percentile(Percentile original) - Copy constructor, creates a new Percentile identical to the original. |
| Method Summary | |
|---|---|
| Percentile | copy() - Returns a copy of the statistic with the same internal state. |
| static void | copy(Percentile source, Percentile dest) - Copies source to dest. |
| double | evaluate(double p) - Returns the result of evaluating the statistic over the stored data. |
| double | evaluate(double[] values, double p) - Returns an estimate of the pth percentile of the values in the values array. |
| double | evaluate(double[] values, int start, int length) - Returns an estimate of the quantileth percentile of the designated values in the values array. |
| double | evaluate(double[] values, int begin, int length, double p) - Returns an estimate of the pth percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values. |
| double | getQuantile() - Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument). |
| private void | insertionSort(double[] work, int begin, int end) - Sort in place a (small) array slice using insertion sort. |
| (package private) int | medianOf3(double[] work, int begin, int end) - Select a pivot index as the median of three. |
| private int | partition(double[] work, int begin, int end, int pivot) - Partition an array slice around a pivot. |
| private double | select(double[] work, int[] pivotsHeap, int k) - Select the kth smallest element from work array. |
| void | setData(double[] values) - Set the data array. |
| void | setData(double[] values, int begin, int length) - Set the data array. |
| void | setQuantile(double p) - Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument). |
| Methods inherited from class org.apache.commons.math.stat.descriptive.AbstractUnivariateStatistic |
|---|
evaluate, evaluate, getData, getDataRef, test, test |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
private static final long serialVersionUID
private static final int MIN_SELECT_SIZE
private static final int MAX_CACHED_LEVELS
private double quantile
private int[] cachedPivots
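| Constructor Detail |
|---|
public Percentile()
Constructs a Percentile with a default quantile value of 50.0.
public Percentile(double p)
Constructs a Percentile with the specific quantile value.
Parameters:
p - the quantile
Throws:
java.lang.IllegalArgumentException - if p is not greater than 0 and less
than or equal to 100
public Percentile(Percentile original)
Copy constructor, creates a new Percentile identical
to the original
Parameters:
original - the Percentile instance to copy
| Method Detail |
|---|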
public void setData(double[] values)
Set the data array.
The stored value is a copy of the parameter array, not the array itself.
setData in class AbstractUnivariateStatistic
Parameters:
values - data array to store (may be null to remove stored data)
See Also:
AbstractUnivariateStatistic.evaluate()
public void setData(double[] values,
int begin,
int length)
Set the data array.
setData in class AbstractUnivariateStatistic
Parameters:
values - data array to store
begin - the index of the first element to include
length - the number of elements to include
See Also:
AbstractUnivariateStatistic.evaluate()
public double evaluate(double p)
Returns the result of evaluating the statistic over the stored data.
The stored array is the one which was set by previous calls to the setData
methods.
Parameters:
p - the percentile value to compute
public double evaluate(double[] values,
double p)
Returns an estimate of the pth percentile of the values
in the values array.
Calls to this method do not modify the internal quantile
state of this statistic.
- Returns Double.NaN if values has length 0
- Returns (for any value of p) values[0] if values has length 1
- Throws IllegalArgumentException if values is null or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
See Percentile for a description of the percentile estimation
algorithm used.
Parameters:
values - input array of values
p - the percentile value to compute
Throws:
java.lang.IllegalArgumentException - if values is null
or p is invalid
public double evaluate(double[] values,
int start,
int length)
Returns an estimate of the quantileth percentile of the
designated values in the values array. The quantile
estimated is determined by the quantile property.
- Returns Double.NaN if length = 0
- Returns (for any value of quantile) values[start] if length = 1
- Throws IllegalArgumentException if values is null, or start or length is invalid
See Percentile for a description of the percentile estimation
algorithm used.
evaluate in interface UnivariateStatistic
evaluate in class AbstractUnivariateStatistic
Parameters:
values - the input array
start - index of the first array element to include
length - the number of elements to include
Throws:
java.lang.IllegalArgumentException - if the parameters are not valid
public double evaluate(double[] values,
int begin,
int length,
double p)
Returns an estimate of the pth percentile of the values
in the values array, starting with the element in (0-based)
position begin in the array and including length
values.
Calls to this method do not modify the internal quantile
state of this statistic.
- Returns Double.NaN if length = 0
- Returns (for any value of p) values[begin] if length = 1
- Throws IllegalArgumentException if values is null, begin or length is invalid, or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
See Percentile for a description of the percentile estimation
algorithm used.
Parameters:
values - array of input values
p - the percentile to compute
begin - the first (0-based) element to include in the computation
length - the number of array elements to include
Throws:
java.lang.IllegalArgumentException - if the parameters are not valid or the
input array is null
private double select(double[] work,
int[] pivotsHeap,
int k)
Select the kth smallest element from work array
Parameters:
work - work array (will be reorganized during the call)
pivotsHeap - set of pivot indices corresponding to elements that
are already at their sorted location, stored as an implicit heap
(i.e. a sorted binary tree stored in a flat array, where the
children of a node at index n are at indices 2n+1 for the left
child and 2n+2 for the right child, with 0-based indices)
k - index of the desired element
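The implicit heap layout of pivotsHeap can be illustrated with a small sketch. It is not the library's select method: the class PivotsHeapSketch, the method narrow, and the use of a negative entry to mean "no pivot cached at this node" are all assumptions made for illustration. The sketch walks from the root towards the children at 2n+1 and 2n+2 to find the slice that must still contain the element of rank k.

```java
import java.util.Arrays;

/** Illustrative sketch only; names and the -1 marker are assumptions. */
final class PivotsHeapSketch {

    static int leftChild(int node) {
        return 2 * node + 1;
    }

    static int rightChild(int node) {
        return 2 * node + 2;
    }

    /**
     * Returns the half-open slice {begin, end} of an array of the given length
     * that must contain the element of rank k, given pivots already in their
     * sorted location. A negative entry means no pivot is cached at that node.
     */
    static int[] narrow(int[] pivotsHeap, int length, int k) {
        int begin = 0;
        int end = length;
        int node = 0;
        while (node < pivotsHeap.length && pivotsHeap[node] >= 0) {
            int pivot = pivotsHeap[node];
            if (k == pivot) {
                return new int[] {pivot, pivot + 1};  // element already in place
            } else if (k < pivot) {
                end = pivot;                          // continue in the left part
                node = leftChild(node);
            } else {
                begin = pivot + 1;                    // continue in the right part
                node = rightChild(node);
            }
        }
        return new int[] {begin, end};
    }

    public static void main(String[] args) {
        // Root pivot 8; its children hold pivots 3 and 12; node 4 holds pivot 5.
        int[] pivotsHeap = {8, 3, 12, -1, 5, -1, -1};
        System.out.println(Arrays.toString(narrow(pivotsHeap, 16, 4)));   // prints [4, 5]
        System.out.println(Arrays.toString(narrow(pivotsHeap, 16, 10)));  // prints [9, 12]
    }
}
```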
int medianOf3(double[] work,
int begin,
int end)
Select a pivot index as the median of three
Parameters:
work - data array
begin - index of the first element of the slice
end - index after the last element of the slice
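A median-of-three pivot choice can be sketched as below. Which three elements are compared is not stated here; the sketch assumes the first, middle, and last elements of the slice, and the class name MedianOf3Sketch is hypothetical.

```java
/** Illustrative sketch only; the choice of the three candidates is an assumption. */
final class MedianOf3Sketch {

    /**
     * Returns the index, among the first, middle and last elements of the
     * slice [begin, end), of the element whose value is the median of the three.
     */
    static int medianOf3(double[] work, int begin, int end) {
        int last = end - 1;                      // end is exclusive
        int middle = begin + (last - begin) / 2;
        double wBegin = work[begin];
        double wMiddle = work[middle];
        double wLast = work[last];
        if (wBegin < wMiddle) {
            if (wMiddle < wLast) {
                return middle;                   // begin < middle < last
            }
            return (wBegin < wLast) ? last : begin;
        }
        if (wBegin < wLast) {
            return begin;                        // middle <= begin < last
        }
        return (wMiddle < wLast) ? last : middle;
    }

    public static void main(String[] args) {
        double[] work = {5, 1, 9, 3, 7};
        // Candidates are 5 (index 0), 9 (index 2) and 7 (index 4); the median is 7.
        System.out.println(medianOf3(work, 0, work.length));  // prints 4
    }
}
```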
private int partition(double[] work,
int begin,
int end,
int pivot)
Partition an array slice around a pivot
Partitioning exchanges array elements such that all elements smaller than pivot are before it and all elements larger than pivot are after it
Parameters:
work - data array
begin - index of the first element of the slice
end - index after the last element of the slice
pivot - initial index of the pivot
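The partitioning step can be sketched as follows. This is a simple single-pass scheme chosen for readability, not necessarily the scheme used by the library; the class name PartitionSketch is hypothetical, and returning the pivot's final index is an assumption about what the int result represents. Double.compare is used so that NaN is treated as the largest value, in line with the ordering described for this class.

```java
import java.util.Arrays;

/** Illustrative sketch only; the partitioning scheme and return value are assumptions. */
final class PartitionSketch {

    /**
     * Rearranges the slice [begin, end) so that elements smaller than the
     * pivot value come before it and the others come after it.
     * Returns the final index of the pivot.
     */
    static int partition(double[] work, int begin, int end, int pivot) {
        double value = work[pivot];
        swap(work, pivot, end - 1);           // park the pivot at the end of the slice
        int store = begin;                    // next slot for a smaller element
        for (int i = begin; i < end - 1; i++) {
            if (Double.compare(work[i], value) < 0) {
                swap(work, i, store);
                store++;
            }
        }
        swap(work, store, end - 1);           // move the pivot to its final position
        return store;
    }

    private static void swap(double[] work, int i, int j) {
        double tmp = work[i];
        work[i] = work[j];
        work[j] = tmp;
    }

    public static void main(String[] args) {
        double[] work = {3, 8, 1, 5, 9, 2};
        int p = partition(work, 0, work.length, 3);  // pivot value is 5
        System.out.println(p);                       // prints 3
        System.out.println(Arrays.toString(work));   // prints [3.0, 1.0, 2.0, 5.0, 9.0, 8.0]
    }
}
```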
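private void insertionSort(double[] work,
int begin,
int end)
Sort in place a (small) array slice using insertion sort
Parameters:
work - array to sort
begin - index of the first element of the slice to sort
end - index after the last element of the slice to sort
public double getQuantile()
Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).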
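public void setQuantile(double p)
Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
Parameters:
p - a value such that 0 < p <= 100
Throws:
java.lang.IllegalArgumentException - if p is not greater than 0 and less
than or equal to 100
public Percentile copy()
Returns a copy of the statistic with the same internal state.
copy in interface UnivariateStatistic
copy in class AbstractUnivariateStatistic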
public static void copy(Percentile source,
Percentile dest)
Copies source to dest.
Neither source nor dest can be null.
Parameters:
source - Percentile to copy
dest - Percentile to copy to
Throws:
java.lang.NullPointerException - if either source or dest is null