resistics.regression.robust module¶
The source for these functions is Robust Statistics, Huber, 2009. In general, linear regression has observations y and predictors A. y holds the multiple observations (the response), x holds the unknown parameters, and y is a linear function of x, so that y = Ax. The sizes are y = nobs, A = nobs * nregressors and x = nregressors.
-
resistics.regression.robust.
andrewsWaveLocationWeights
(r: numpy.ndarray, k: float) → numpy.ndarray[source]¶ Andrews Wave location weights
- Parameters
- rnp.ndarray
Residuals
- kfloat
Tuning parameter
- Returns
- weightsnp.ndarray
The robust weights
-
resistics.regression.robust.
bisquareLocationWeights
(r: numpy.ndarray, k: float) → numpy.ndarray[source]¶ Bisquare location weights
- Parameters
- rnp.ndarray
Residuals
- kfloat
Tuning parameter
- Returns
- weightsnp.ndarray
The robust weights
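As an illustration only, the textbook Tukey bisquare weight is \((1 - (r/k)^2)^2\) for \(|r| \le k\) and zero otherwise. The sketch below assumes that form and assumes the residuals have already been divided by a scale estimate; the name `bisquare_weights_sketch` is hypothetical and the resistics implementation may differ in detail.

```python
import numpy as np


def bisquare_weights_sketch(r: np.ndarray, k: float) -> np.ndarray:
    """Textbook Tukey bisquare weights (hypothetical helper, not the resistics code)."""
    # Work with magnitudes so that complex residuals are also handled
    u = np.abs(np.asarray(r)) / k
    weights = np.zeros_like(u)
    inside = u <= 1.0
    # Smoothly down-weight residuals inside the cutoff, reject those outside
    weights[inside] = (1.0 - u[inside] ** 2) ** 2
    return weights


example = bisquare_weights_sketch(np.array([0.0, 2.0, 10.0]), k=4.0)
```

Residuals beyond the tuning parameter receive a weight of exactly zero, which is what makes the bisquare a redescending weighting scheme.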
-
resistics.regression.robust.
chatterjeeMachler
(A: numpy.ndarray, y: numpy.ndarray, **kwargs) → Tuple[source]¶ Robust bounded influence solver
Solves for \(x\) where,
\[y = Ax .\]
Being a bounded influence method, it should be robust against outliers in both the dependent and the independent variables.
- Parameters
- Anp.ndarray
Predictors, size nobs*nregressors
- ynp.ndarray
Observations, size nobs
- interceptbool, optional
True or False for adding an intercept term
- Returns
- paramsnp.ndarray
Values in x
- residsnp.ndarray
Residuals = y - Ax
- weightsnp.ndarray
Weights used in robust regression
-
resistics.regression.robust.
chatterjeeMachlerHadi
(X, y, **kwargs)[source]¶ Regression based on Hadi distances
Another regression method based on Hadi distances, implemented from the paper "A Re-Weighted Least Squares Method for Robust Regression Estimation" (Billor, Hadi).
-
resistics.regression.robust.
defaultDictionary
() → Dict[source]¶ Robust regression defaults
- Returns
- Dict
Default regression options
-
resistics.regression.robust.
eps
() → float[source]¶ Small number
- Returns
- float
A small number for quitting robust regression
-
resistics.regression.robust.
getRobustLocationWeights
(r: numpy.ndarray, weight: str) → numpy.ndarray[source]¶ Robust weighting schemes
- Parameters
- rnp.ndarray
Residuals
- weightstr
The type of weighting to use
- Returns
- weightsnp.ndarray
The robust weights
-
resistics.regression.robust.
hampelLocationWeights
(r: numpy.ndarray, k: float) → numpy.ndarray[source]¶ Hampel location weights
- Parameters
- rnp.ndarray
Residuals
- kfloat
Tuning parameter
- Returns
- weightsnp.ndarray
The robust weights
-
resistics.regression.robust.
hermitianTranspose
(mat: numpy.ndarray) → numpy.ndarray[source]¶ Hermitian transpose (transpose and complex conjugation)
- Parameters
- np.ndarray
Vector, matrix to Hermitian transpose
- Returns
- np.ndarray
Hermitian transpose
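For reference, a Hermitian transpose can be written in plain NumPy as a complex conjugation followed by a transpose. This is a generic sketch and is not necessarily how the function above is implemented.

```python
import numpy as np

mat = np.array([[1 + 2j, 3 - 1j], [0 + 1j, 4 + 0j]])

# Conjugate every entry, then swap rows and columns
hermitian = np.conj(mat).T
```

Applying the operation twice returns the original matrix, and for real input it reduces to an ordinary transpose.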
-
resistics.regression.robust.
huberLocationWeights
(r: numpy.ndarray, k: float) → numpy.ndarray[source]¶ Huber location weights
- Parameters
- rnp.ndarray
Residuals
- kfloat
Tuning parameter
- Returns
- weightsnp.ndarray
The robust weights
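As an illustration only, the textbook Huber weight is 1 for \(|r| \le k\) and \(k / |r|\) otherwise. The sketch below assumes that form and assumes the residuals have already been divided by a scale estimate; the name `huber_weights_sketch` is hypothetical and the resistics implementation may differ in detail.

```python
import numpy as np


def huber_weights_sketch(r: np.ndarray, k: float) -> np.ndarray:
    """Textbook Huber weights (hypothetical helper, not the resistics code)."""
    # Magnitudes, so that complex residuals are also handled
    abs_r = np.abs(np.asarray(r))
    weights = np.ones_like(abs_r, dtype=float)
    outside = abs_r > k
    # Large residuals are down-weighted in proportion to their size
    weights[outside] = k / abs_r[outside]
    return weights


example = huber_weights_sketch(np.array([0.5, -1.5, 3.0]), k=1.5)
```

Unlike a hard cutoff, large residuals keep a small non-zero weight, so no observation is discarded outright.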
-
resistics.regression.robust.
initialFromDict
(initDict: Dict) → Tuple[source]¶ Returns initial model from provided initial model dictionary
Useful for two-stage robust regression.
- Parameters
- Dict
Initial model to use for robust regression with the parameters, residuals and scale estimate
- Returns
- parametersnp.ndarray
- residsnp.ndarray
The residuals
- scalefloat
Initial estimate of scale
-
resistics.regression.robust.
leastSquaresLocationWeights
(r: numpy.ndarray)[source]¶ Least squares weights, which are all equal to 1
- Parameters
- rnp.ndarray
Residuals
- Returns
- weightsnp.ndarray
The robust weights
-
resistics.regression.robust.
maxIter
() → int[source]¶ Maximum number of iterations
- Returns
- int
The maximum number of iterations
-
resistics.regression.robust.
mestimateModel
(A: numpy.ndarray, y: numpy.ndarray, **kwargs) → Tuple[source]¶ Mestimate robust least squares
Solves for \(x\) where,
\[y = Ax .\]
A good method when the outliers are in the dependent variable (in \(y\)). It is not robust against outliers in the independent variables (leverage points).
- Parameters
- Anp.ndarray
Predictors, size nobs*nregressors
- ynp.ndarray
Observations, size nobs
- initial :
- scaleoptional
A scale estimate
- interceptbool, optional
True or False for adding an intercept term
- Returns
- paramsnp.ndarray
Values in x
- residsnp.ndarray
Residuals = y - Ax
- scalefloat
Robust measure of variance
- weightsnp.ndarray
Weights used in robust regression
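An M-estimate is commonly computed by iteratively reweighted least squares: fit, compute residuals, estimate a robust scale, derive weights, and refit until the parameters stop changing. The following is a minimal sketch under stated assumptions, namely Huber weights with k = 1.345, a scale taken as the median absolute residual divided by 0.6745, and real-valued data. The function name, defaults and stopping rule are illustrative and are not taken from resistics.

```python
import numpy as np


def irls_huber_sketch(A, y, k=1.345, max_iter=50, tol=1e-8):
    """Iteratively reweighted least squares with Huber weights (illustrative only)."""
    # Start from the ordinary least squares solution
    params, *_ = np.linalg.lstsq(A, y, rcond=None)
    weights = np.ones(y.shape[0])
    scale = 1.0
    for _ in range(max_iter):
        resids = y - A @ params
        # Robust scale: median absolute residual, normalised for Gaussian data
        scale = np.median(np.abs(resids)) / 0.6745
        if scale < tol:
            break  # residuals are essentially zero, nothing left to down-weight
        abs_u = np.abs(resids) / scale
        weights = np.ones_like(abs_u)
        outside = abs_u > k
        weights[outside] = k / abs_u[outside]
        # Weighted least squares via scaling rows by the square root of the weights
        sqrt_w = np.sqrt(weights)
        new_params, *_ = np.linalg.lstsq(A * sqrt_w[:, None], y * sqrt_w, rcond=None)
        converged = np.linalg.norm(new_params - params) <= tol * (1.0 + np.linalg.norm(params))
        params = new_params
        if converged:
            break
    resids = y - A @ params
    return params, resids, scale, weights


# Synthetic line y = 1 + 2t with small noise and a few gross outliers in y
rng = np.random.default_rng(0)
A = np.column_stack([np.ones(50), np.linspace(0.0, 1.0, 50)])
y = A @ np.array([1.0, 2.0]) + 0.01 * rng.standard_normal(50)
y[::10] += 5.0
params, resids, scale, weights = irls_huber_sketch(A, y)
```

In this synthetic case the outlying observations end up with weights close to zero, so the recovered parameters stay near the true intercept and slope.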
-
resistics.regression.robust.
mmestimateModel
(A: numpy.ndarray, y: numpy.ndarray, **kwargs)[source]¶ 2 stage M estimate
Solves for \(x\) where,
\[y = Ax .\]
- Parameters
- Anp.ndarray
Predictors, size nobs*nregressors
- ynp.ndarray
Observations, size nobs
- initialDict
Initial solution with parameters, scale and residuals
- scaleoptional
A scale estimate
- interceptbool, optional
True or False for adding an intercept term
- Returns
- paramsnp.ndarray
Values in x
- residsnp.ndarray
Residuals = y - Ax
- scalefloat
Robust measure of variance
- weightsnp.ndarray
Weights used in robust regression
-
resistics.regression.robust.
olsModel
(A, y, **kwargs) → Tuple[source]¶ Ordinary least squares
Solves for \(x\) where,
\[y = Ax .\]
- Parameters
- Anp.ndarray
Predictors, size nobs*nregressors
- ynp.ndarray
Observations, size nobs
- interceptbool, optional
True or False for adding an intercept term
- Returns
- paramsnp.ndarray
Least squares solution
- residsnp.ndarray
Residuals
- squareResidnp.ndarray
Square residuals
- rankint
Rank of matrix A
- snp.ndarray
Singular values of A
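The returned values resemble what `numpy.linalg.lstsq` provides (solution, sum of squared residuals, rank and singular values). The sketch below shows that NumPy call on a small example; it is an assumption that the function above wraps something similar, and the variable names are illustrative.

```python
import numpy as np

# Four observations of the line y = 1 + 2t, with an explicit intercept column
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# lstsq returns the solution, squared residual sum, rank of A and singular values of A
params, square_resid, rank, s = np.linalg.lstsq(A, y, rcond=None)
resids = y - A @ params
```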
-
resistics.regression.robust.
sampleMAD
(data)[source]¶ Median absolute deviation
The standard deviation is not robust against outliers, hence use the MAD.
- Parameters
- np.ndarray
Data for which to calculate MAD
- Returns
- float
The MAD
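A short sketch of the usual definition, the median of the absolute deviations from the median, compared with the standard deviation on data containing one outlier. Whether the function above also applies a normalising constant is not stated here, so none is used in this sketch.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Median absolute deviation about the median
mad = np.median(np.abs(data - np.median(data)))

# The standard deviation is dominated by the single outlier
std = np.std(data)
```

Here the MAD is 1.0, while the standard deviation is roughly 39 because of the value 100.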
-
resistics.regression.robust.
sampleMAD0
(data)[source]¶ Median absolute deviation using an estimate of the location as 0
When the location estimate is zero (rather than the median), the MAD essentially reduces to the median of the absolute values of the data. This should be calculated over non-zero data. Useful for calculating the variance of residuals.
- Parameters
- np.ndarray
Data for which to calculate MAD. This is often residuals when using 0 as an estimate of location.
- Returns
- float
The MAD using zero as an estimate of location
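With the location fixed at zero, the calculation is simply the median of the absolute values. A minimal sketch on some made-up residuals, without any normalising constant (whether the function above applies one is not stated here):

```python
import numpy as np

resids = np.array([-0.5, 0.2, -0.1, 0.3, 8.0])

# Median of the absolute residuals, taking zero as the location
mad0 = np.median(np.abs(resids))
```

The single large residual of 8.0 has no effect on the result, which is 0.3.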
-
resistics.regression.robust.
sampleMedian
(data)[source]¶ Calculate the median of an array
The mean is not a robust estimator of location as it can be broken by a single outlying value. The median is a more robust choice.
- Parameters
- np.ndarray
Data for which to calculate median
- Returns
- float
The median
-
resistics.regression.robust.
trimmedMeanLocationWeights
(r: numpy.ndarray, k: float) → numpy.ndarray[source]¶ Trimmed mean location weights
- Parameters
- rnp.ndarray
Residuals
- kfloat
Tuning parameter
- Returns
- weightsnp.ndarray
The robust weights
-
resistics.regression.robust.
weightLS
(A: numpy.ndarray, y: numpy.ndarray, weights: numpy.ndarray) → Tuple[numpy.ndarray][source]¶ Transform A and y using the weights to perform a weighted least squares
\[\sqrt{weights} y = \sqrt{weights} A x ,\]
solved in the least squares sense, is equivalent to,
\[A^H weights y = A^H weights A x ,\]
where \(A^H\) is the Hermitian transpose.
In this method, both y and A are multiplied by the square root of the weights and then returned.
- Parameters
- ynp.ndarray
Observations
- Anp.ndarray
Regressors
- Returns
- ynp.ndarray
Observations multiplied by the square root of the weights
- Anp.ndarray
Regressors multiplied by the square root of the weights
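The transformation can be sketched in a few lines of NumPy, together with a check that solving the transformed system by least squares matches the weighted normal equations. The helper name `weight_ls_sketch` and its return order are assumptions made for illustration, not the resistics implementation.

```python
import numpy as np


def weight_ls_sketch(A, y, weights):
    """Scale the rows of A and entries of y by sqrt(weights) (hypothetical helper)."""
    sqrt_w = np.sqrt(weights)
    return A * sqrt_w[:, np.newaxis], y * sqrt_w


A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 100.0])  # last observation is an outlier
weights = np.array([1.0, 1.0, 1.0, 0.0])  # zero weight removes it

# Least squares on the transformed system
A_w, y_w = weight_ls_sketch(A, y, weights)
x_ls, *_ = np.linalg.lstsq(A_w, y_w, rcond=None)

# Weighted normal equations: A^H W A x = A^H W y
AH = np.conj(A).T
x_ne = np.linalg.solve(AH @ (weights[:, np.newaxis] * A), AH @ (weights * y))
```

Both routes give the same parameters. In practice the square-root form is often preferred because it avoids explicitly forming \(A^H weights A\).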