{smcl}
{* *! version 1.0 3mar2011}{...}
{cmd:help oaoutlier}
{hline}

{title:Title}

{p2colset 5 20 22 2}{...}
{phang}
{bf:oaoutlier} {hline 2} Outlier detection based on order-alpha efficiency analysis{p_end}
{p2colreset}{...}


{title:Syntax}

{p 8 17 2}
{cmd:oaoutlier} {it:{help varname:varname}} {ifin}, {opt inp:uts}({it:{help varlist:varlist1}}) {opt out:puts}({it:{help varlist:varlist2}}) [{cmd:}{it:{help oaoutlier##options:options}}]

{synoptset 28 tabbed}{...}
{marker technology_definition}{...}
{synopthdr :technology_definition}
{synoptline}
{syntab :Model}
{synopt :{it:{help varname:varname}}}identifier{p_end}
{synopt :{opth in:puts(varlist1)}}list of input variables{p_end}
{synopt :{opth out:puts(varlist2)}}list of output variables{p_end}

{synoptset 28 tabbed}{...}
{synopthdr :options}
{synoptline}
{syntab :Main}
{synopt :{opt {ul on}ort{ul off}(input|output)}}consider {it:input} or {it:output} oriented efficiency; default is {opt ort(input)}{p_end}
{synopt :{opt nal:pha(#)}}try {it:#} values for alpha; the maximum allowed value is {it:N}, which is also the default{p_end}

{syntab :Detection}
{synopt :{opt no{ul on}bic{ul off}}}do not suggest discontinuities based on BIC{p_end}
{synopt :{opt no{ul on}rou{ul off}gh}}do not suggest discontinuities based on rough series of differences in differences{p_end}
{synopt :{opt no{ul on}smo{ul off}oth}}do not suggest discontinuities based on smoothed series of differences in differences{p_end}
{synopt :{opt {ul on}smoother{ul off}(string)}}use smoother {it:string} for smoothing series of differences in differences{p_end}

{syntab :Reporting}
{synopt :{opt no{ul on}plo{ul off}t}}suppress plotting the series of shares of super-efficient dmus{p_end}
{synopt :{opt dot:s}}display loop dots{p_end}
{synoptline}
{p2colreset}{...}
{p 4 6 2}
{opt weights} are not allowed; see {help weight}.{p_end}
{p 4 6 2}{cmd:bootstrap}, {cmd:by}, and {cmd:svy} are not allowed; see {help prefix}.{p_end}


{title:Description}

{pstd}
{cmd:oaoutlier} is an
exploratory tool for detecting potential outliers in data meant for non-parametric efficiency analysis, e.g. {help dea:DEA}. {cmd:oaoutlier} applies the approach suggested by Daraio and Simar (2007). A series of {help orderalpha:order-alpha} efficiency analyses using increasing values for the benchmark percentile alpha is run on the data. Subsequently, it is examined how the share of super-efficient dmus develops for increasing values of alpha. A smooth decrease of the share points to the absence of outliers. A discontinuity in the series, leading to a sharp decrease of the share, however, points to outliers being present. Specifically, those dmus that are still classified as super-efficient for values of alpha higher than the one at the point of discontinuity are most likely outliers.

{pstd}
{cmd:oaoutlier} plots the series and tries three rules for detecting points of discontinuity. Two are local rules, based on the series of differences in differences as a non-parametric estimate of the curvature of the original series. The first looks for the minimum values of the twice differenced series that follow a non-negative value. The second looks for negative values that persist after repeatedly smoothing the series of differences in differences by running odd-spaced median smoothers using {help smooth:smooth}. Up to three points of discontinuity are suggested by each local rule. {cmd:oaoutlier} applies one global rule by splitting the series into two parts and fitting a linear or quadratic function to each. This rule suggests the split point that minimizes the BIC as the point of discontinuity.

{pstd}
The number of dmus is limited to the value of {help matsize:matsize}. For large samples {cmd:oaoutlier} requires substantial computing time. Nevertheless, using {help orderalpha:orderalpha} within a loop rather than {cmd:oaoutlier} for generating a series of super-efficiency shares is strongly discouraged, as {cmd:oaoutlier} is still much faster at performing this task.{p_end}
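
{pstd}
As an illustration of the local rules, the twice differenced series can be rebuilt by hand from the saved matrix {cmd:r(oaresult)}. This is only a sketch and not the code {cmd:oaoutlier} runs internally. It assumes that the first column of {cmd:r(oaresult)} holds the values of alpha and the second the corresponding shares, and that the values of alpha are equally spaced. The names {cmd:R}, {cmd:oa1}, {cmd:oa2}, and {cmd:d2share} are made up for this example, and the new variables are added to the data in memory:{p_end}

{phang2}{cmd:. oaoutlier firm, inputs(capital labor energy) outputs(durables perishables) noplot}{p_end}
{phang2}{cmd:. matrix R = r(oaresult)}{p_end}
{phang2}{cmd:. svmat double R, names(oa)}{p_end}
{phang2}{cmd:. generate double d2share = oa2[_n+1] - 2*oa2 + oa2[_n-1]}{p_end}
{phang2}{cmd:. list oa1 oa2 d2share if d2share < 0 & d2share[_n-1] >= 0 & d2share[_n-1] < .}{p_end}

{pstd}
The listed rows are candidates in the spirit of the first local rule: negative values of the twice differenced series that follow a non-negative value. The rows of {cmd:oa1} and {cmd:oa2} follow the order of alpha, not the order of the dmus in the data.{p_end}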
{title:Technology Definition}

{dlgtab:Model}

{phang}
{it:varname} must uniquely identify dmus. {it:varname} may be either a numeric or a string variable.

{phang}
{opt inputs(varlist1)} specifies inputs to the analyzed production process. At least one input variable is required. Any variable in {it:varlist1} needs to be numeric and strictly positive. Dmus with missing or non-positive values in {it:varlist1} are dropped.

{phang}
{opt outputs(varlist2)} specifies outputs from the analyzed production process. At least one output variable is required. Any variable in {it:varlist2} needs to be numeric and strictly positive. Dmus with missing or non-positive values in {it:varlist2} are dropped. {it:varlist2} may not share any variable with {it:varlist1}.


{marker options}{...}
{title:Options}

{dlgtab:Main}

{phang}
{opt ort(input|output)} specifies whether {it:input} or {it:output} oriented efficiency is considered. The default is {opt ort(input)}.

{phang}
{opt nalpha(#)} specifies the number {it:#} of values tried for alpha. The default and maximum allowed value is {it:N}. To reduce computing time, one may specify smaller values. However, suggested points of discontinuity are not invariant to the choice of {opt nalpha(#)}. Very small values may cause an error.{p_end}

{dlgtab:Detection}

{phang}
{opt nobic} suppresses the global BIC-based detection rule.

{phang}
{opt norough} suppresses the local detection rule that is based on the rough series of differences in differences.

{phang}
{opt nosmooth} suppresses the local detection rule that is based on the smoothed series of differences in differences. If {opt nobic}, {opt norough}, and {opt nosmooth} are simultaneously specified, {cmd:oaoutlier} just plots the series of shares of super-efficient dmus.

{phang}
{opt smoother(string)} uses {it:string} as smoother for smoothing the twice differenced series; see {help smooth:smooth} on how to specify smoothers.
The default is the series {it:3}, {it:3RSREH}, {it:3RSR5REH}, {it:3RSR5R7REH}, {it:3RSR5R7R9REH} of smoothers, applied in turn until the smoothed series no longer includes any negative value or until the last smoother has been applied.

{dlgtab:Reporting}

{phang}
{opt noplot} suppresses plotting the results to a graph.

{phang}
{opt dots} displays loop dots. One dot is displayed for each dmu being analyzed.


{title:Examples}

{phang2}{cmd:. oaoutlier firm, inputs(capital labor energy) outputs(durables perishables) ort(output) dots}{p_end}

{pstd}Here {it:oaoutlier} and {it:orderalpha} precede {it:dea} and already respect {it:dea}'s requirements for naming variables:{p_end}

{phang2}{cmd:. oaoutlier dmu, inputs(i_capital i_labor i_energy) outputs(o_durables o_perishables) ort(output)}{p_end}
{phang2}{cmd:. orderalpha dmu, inputs(i_capital i_labor i_energy) outputs(o_durables o_perishables) ort(output) invert alpha(`r(asmooth1)') gen(oaeffi)}{p_end}
{phang2}{cmd:. dea if oaeffi <= 1, rts(vrs) ort(o)}{p_end}


{title:Saved results}

{pstd}
{cmd:oaoutlier} saves the following in {cmd:r()}:

{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Scalars}{p_end}
{synopt:{cmd:r(asmooth#)}}point of discontinuity suggested by local rule using smoothed series ({it:#} may take values up to 3){p_end}
{synopt:{cmd:r(arough#)}}point of discontinuity suggested by local rule using rough series ({it:#} may take values up to 3){p_end}
{synopt:{cmd:r(abic1)}}point of discontinuity suggested by global rule{p_end}
{synopt:{cmd:r(ssmooth#)}}share of super-efficient dmus that corresponds to {opt r(asmooth#)}{p_end}
{synopt:{cmd:r(srough#)}}share of super-efficient dmus that corresponds to {opt r(arough#)}{p_end}
{synopt:{cmd:r(sbic1)}}share of super-efficient dmus that corresponds to {opt r(abic1)}{p_end}

{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Macros}{p_end}
{synopt:{cmd:r(cmd)}}{cmd:oaoutlier}{p_end}
{synopt:{cmd:r(cmdline)}}command as typed{p_end}
{synopt:{cmd:r(title)}}{cmd:Order-alpha
based outlier detection}{p_end}
{synopt:{cmd:r(ort)}}either {opt input} or {opt output}{p_end}

{synoptset 20 tabbed}{...}
{p2col 5 20 24 2: Matrices}{p_end}
{synopt:{cmd:r(oaresult)}}{it:Nx2} matrix of values tried for alpha and corresponding shares of super-efficient dmus{p_end}
{p2colreset}{...}


{title:References}

{pstd}
Daraio, C. and L. Simar (2007). {it:Advanced robust and nonparametric methods in efficiency analysis: Methodology and applications}. Springer, New York.


{title:Also see}

{psee}
Manual: {manlink R smooth}

{psee}
{space 2}Help: {manhelp smooth R:smooth}{break}

{psee}
Online: {helpb dea}, {helpb orderalpha}, {helpb orderm}{p_end}


{title:Author}

{psee}
Harald Tauchmann{p_end}
{psee}
Rheinisch-Westfälisches Institut für Wirtschaftsforschung (RWI){p_end}
{psee}
Essen, Germany{p_end}
{psee}
E-mail: harald.tauchmann@rwi-essen.de {p_end}


{title:Disclaimer}

{pstd}
This software is provided "as is" without warranty of any kind, either expressed or implied. The entire risk as to the quality and performance of the program is with you. Should the program prove defective, you assume the cost of all necessary servicing, repair or correction. In no event will the copyright holders or their employers, or any other party who may modify and/or redistribute this software, be liable to you for damages, including any general, special, incidental or consequential damages arising out of the use or inability to use the program.
{p_end}


{title:Acknowledgements}

{pstd}
This work has been supported in part by the Collaborative Research Center "Statistical Modelling of Nonlinear Dynamic Processes" (SFB 823) of the German Research Foundation (DFG).
{p_end}