COMPARISON AND SELECTION OF POPULATIONS WITH SPECIAL REFERENCE TO THE NORMAL DISTRIBUTION

By
Kerrie Mengersen
B.A. Hons (U.N.E.)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY OF THE UNIVERSITY OF NEW ENGLAND

January 1988

Acknowledgements

There are many people who have assisted me with the content and production of this thesis, and to all these people, I give my thanks. There are a number, however, who have been of primary support and must be thanked personally.

Firstly, my deepest gratitude goes to my supervisor, Eve Bofinger, for her enthusiasm in the initiation of this thesis, and her assistance and encouragement in its development. I am fortunate to have had the opportunity to work with and learn from such an inspired statistician. I am also grateful to Richard Tweedie for his enduring faith in the feasibility of applying ranking and selection theory to personnel selection. Other members of the Statistics Department at U.N.E. are also acknowledged; Vic Bofinger's support and advice is appreciated in particular.

The staff in the Computer Centre at U.N.E. are thanked for their invaluable help in the production of this thesis. The Computer Centre and Siromath Pty Ltd must also be acknowledged for their assistance in the development of the software package, PERSEL.

My family deserves a special thanks, for without their love and encouragement I would not have attempted, much less completed, this thesis.

Firstly and finally, thank you Ges, for your support, advice and help throughout my candidature.

Abstract

In this thesis, a confidence bound approach is employed to investigate the goals of comparison and selection of the t best populations, and simultaneous comparisons with a control and with the best. A particular case studied in detail is that of the normal distribution. The robustness of one of the selection procedures to this assumption of normality is also considered.
Emphasis is placed on the practical application of these goals and the proposed procedures, in particular to the problem of personnel selection, for which a detailed methodology is discussed.

NOTATION

SYMBOL            MEANING                             FIRST REFERENCE
CS                correct selection                   p. 83
PCS               probability of correct selection    p. 5
LFC               least favourable configuration      p. 5
CUB               correct upper bounds                p. 85
CLB               correct lower bounds                p. 101
IZ                indifference zone                   p. 4
PZ                preference zone                     p. 5
SS                subset selection                    p. 4
CB                confidence bound                    p. 4
GLD               generalised lambda distribution     p. 29
s.i.              stochastically increasing           Table A.1
a.s.              almost surely                       p. 87
iff               if and only if                      Table A.1
cdf               cumulative distribution function    p. 87
pdf               probability density function        p. 87
df                degrees of freedom                  Table A.1
$\exists$         there exists                        p. 116
$\to$             tends to                            p. 18
$\uparrow$        increasing                          Table A.1
$\downarrow$      decreasing                          Table A.1
$\Rightarrow$     implies                             p. 58
$\in$             element of                          p. 55
$\subseteq$       subset of or equal to               p. 82
$\forall$         for all                             p. 5

SYMBOL            MEANING                             FIRST REFERENCE
$n$, $k$, $p$, $\nu$, $\rho$, $\chi^2_\nu$, $N(\theta, \sigma^2)$
                                                      p. 5, p. 72, p. 5, p. 5, p. 135, p. 26, p. 22, p. 55, p. 72, p. 86, p. 108
                  $= \{R(k-t+1), R(k-t+2), \ldots, R(k)\}$              p. 55
                  $= \{R(1), R(2), \ldots, R(t)\}$                      p. 72
                  $= \{i : X_i > X_{R(k-t+1)} - d\}$, $d > 0$           p. 6
                  $= \{i : Y_i < c^{-1} Y_{R(t)}\}$, $0 < c < 1$        p. 96
                  $= \{i : X_i > X_{R(k-t)} + d\}$, $d > 0$             p. 100
                  $= \{i : Y_i < c^{-1} Y_{R(t+1)}\}$, $0 < c < 1$      p. 102
                  $= \{i : X_i - X_0 > c\}$, $c > 0$                    p. 109
                  complement of $I$                                     p. 109
                  complement of $G$                                     p. 109
                  $= \min_{i \in G_t} \theta_i$                         p. 56
                  $= \max_{j \in G_t} \theta_j$                         p. 56
                  $= \max_{i \in G_t'} \psi_i$                          p. 73
                  $= \min_{j \in G_t'} \psi_j$                          p. 73

SYMBOL            MEANING                             FIRST REFERENCE
$P_{k,t}(d)$      $= \int_{-\infty}^{\infty} (1 - F(x - d))^t \, d(F(x))^{k-t}$      p. 57
$P_{k,t}^{(\nu)}(d)$      $= \int_0^{\infty} \int_{-\infty}^{\infty} (1 - F(x - dS/\sigma))^t \, d(F(x))^{k-t} \, dG_\nu(S/\sigma)$      p. 58
$P_{k,t}^{(\nu)}(d, \rho)$      $= \int_0^{\infty} \int_{-\infty}^{+\infty} \left(1 - F\left(\frac{u\rho^{1/2} - dS/\sigma}{(1-\rho)^{1/2}}\right)\right)^t d(F(u))^{k-t} \, dG_\nu(S/\sigma)$      p. 61
$P_L$             $= (k-t+1) P_{k-1,t-1}(d) - (k-t) P_{k,t-1}(d) + P_{k-t+1,1}(d) - 1$      p. 134
$Q_{k,t}(c)$      $= t \int_0^{\infty} (1 - F(cy))^{k-t} (F(y))^{t-1} \, dF(y)$      p. 73
$U_r(b)$          $= \int (F_1(x + b))^{r-1} \, dF_1(x)$      p. 110
$V_{p-r+1}(c)$    $= \int (F_1(x + c))^{p-r} \, dF_0(x)$      p. 110
$Q_r(c, b)$       $= V_{p-r+1}(c) \, U_r(b)$                  p. 113
$\cap$            is distributed as, or intersection of, as appropriate      p. 58
$\cup$            union of                                    p. 19
$\Delta$          forward differences (also a constant and function, pp. 11, 22, 32)      p. 65
$[a]$             integer part of $a$                         p. 59
$a^-$             a value just smaller than $a$               p. 71
                  parameters of GLD                           p. 135
                  standardised $i$th moment                   p. 135
$\beta$           beta function                               p. 135
$C_u$             $u$th candidate                             p. 41
$A_w$             desirable attribute                         p. 41
$F_w$             maximum score attached to $A_w$             p. 41
$M_w$             minimum score attached to $A_w$             p. 43
$I_z$             $z$th individual specifying attribute weights      p. 41
                  $v$th judge                                 p. 42
                  $v$th judge's score for $u$th candidate for $w$th attribute      p. 42

Contents

Acknowledgements
Abstract
Notation
1 INTRODUCTION
2 LITERATURE REVIEW
  2.1 Introduction
    2.1.1 Indifference Zone Approach
    2.1.2 Subset Selection Approach
    2.1.3 Confidence Bound Approach
    2.1.4 Software
  2.2 Selection of Best Populations
    2.2.1 Indifference Zone Approach
    2.2.2 Subset Selection Approach
    2.2.3 Combined Approaches
    2.2.4 Confidence Bound Approach
    2.2.5 Other Designs
    2.2.6 Related Goals
    2.2.7 Other Distributions
  2.3 Comparison with a Control
    2.3.1 IZ Approach
    2.3.2 SS Approach
    2.3.3 CB Approach
    2.3.4 Other Considerations
  2.4 Robustness of Ranking and Selection Rules
3 AN APPLICATION OF RANKING AND SELECTION
  3.1 Introduction
  3.2 The PERSEL Methodology
  3.3 Ranking and selection procedures
  3.4 PERSEL - A software package for personnel selection
    3.4.1 Overview of PERSEL
  3.5 Example
4 CONFIDENCE BOUNDS FOR THE T BEST POPULATIONS
  4.1 Introduction
  4.2 Location Parameter Case
    4.2.1 Comparison of the worst selected and best non-selected populations
  4.3 Scale Parameter Case
    4.3.1 Extension of Location Parameter Results
  4.4 Example
5 SUBSET SELECTION OF THE T BEST POPULATIONS
  5.1 Introduction
  5.2 Selection of all t good populations - location parameter case
    5.2.1 A previous approach
    5.2.2 A new approach
    5.2.3 Normal means case
  5.3 Selection of all t good populations - scale parameter case
    5.3.1 A previous approach
    5.3.2 A new approach
    5.3.3 Normal variances case
  5.4 Selecting only good populations
    5.4.1 Location parameter case
    5.4.2 Scale parameter case
  5.5 Example
6 SIMULTANEOUS COMPARISONS WITH A CONTROL AND WITH THE BEST
  6.1 Introduction
  6.2 Confidence Bounds
    6.2.1 Notation
    6.2.2 Construction of bounds
  6.3 Selection Decisions
  6.4 Special Results for the Normal Means Case with Common Unknown Variance
    6.4.1 Upper bound on the joint confidence
  6.5 Related Problems
    6.5.1 Simultaneous comparison with a control and with the t best
    6.5.2 Simultaneous comparisons with restrictions on the "best"
  6.6 Example
7 ROBUSTNESS TO NORMALITY OF A SELECTION RULE
  7.1 Introduction
  7.2 The Selection Problem
  7.3 The Generalised Lambda Distribution
    7.3.1 Generating random variables
    7.3.2 Approximating empirical distributions
    7.3.3 Ranking and selection using the GLD
  7.4 Computation of the lower bound for given distributions
    7.4.1 Alternatives if $P_L$ is small
  7.5 General results
    7.5.1 Empirical assessment of the effects of non-normality
  7.6 Example
8 CONCLUSION
BIBLIOGRAPHY
APPENDICES
A PRECIS OF RESULTS FROM LITERATURE
B TABLES OF VALUES FOR CHAPTERS 4, 5 AND 6
C CONTOUR PLOTS FOR CHAPTER 7

List of Tables

3.1 Attributes and Scores for Assessing Lecturers
3.2 Scores JTUV for all Lecturers
4.1 Existing tables for $P_{k,t}^{(\nu)}(d, \rho)$
4.2 Simulation results: Values of a for a = 0.05, 0.01
4.3 Comparison of c and c
5.1 Comparison of the PCS under configurations C1 and C2 for $\nu = \infty$
5.2 Simulated and calculated results under configuration C1
5.3 Comparison of c and c
5.4 Comparison of selections under Procedures $R_3$, $P_2$ and $P_{conj}$
6.1 Upper bounds for various combinations of $\{P^*, p, \rho, \nu\}$
7.1 Characteristics of Given Distributions. Reproduced from Ramberg and Schmeisser (1972, 1974)
7.2 Values of $P_L$ for selected symmetric distributions
A.1 Precis of results from literature review
B.1 Values of d satisfying $P_{k,t}^{(\nu)}(d) = P^*$
B.2 Values of c satisfying $Q_{k,t}^{(\nu)}(c) = P^*$
B.3 Values of d for Procedure $P_2$ for various $P^*$
B.4 Values of c for Procedure $P_3$ for various $P^*$
B.5 Values of $b_C$ and $c_C$ for case (i) for various $P^*$
B.6 Values of $b_B$ and $c_B$ for case (ii) for various $P^*$
B.7 Values of $b_{CB}$ for case (iii) for various $P^*$

List of Figures

4.1 Plot of d vs $1/\nu$ for various t
4.2 Plot of $d^2$ vs $\ln(1 - P_{k,t}^{(\nu)}(d))$ for various $\nu$
4.3 Plot of $c^2$ vs $Q_{k,t}^{(\nu)}(c)$ for various $\nu$
5.1 PCS at configuration C1 with $\nu = \infty$ and $P^* = 0.90, 0.10$
C.1 Contour plots of $P_L$ for given $(P^*, k, t)$ combinations