Partial stratified ranked set sampling scheme for estimation of population mean and median

1. Introduction

Obtaining a more efficient and cost-effective estimate of the population mean is one of a researcher’s major goals. This goal can be accomplished by modifying the selection procedure of existing sampling designs. Ranked set sampling (RSS) provides a more efficient, economical, and unbiased estimation of the population parameter. RSS uses a cost-free mechanism to rank the sampling or experimental units, which reduces the cost of a survey. McIntyre [1] was the first to propose RSS as a sampling design for population mean estimation. Takahasi and Wakimoto [2] showed that RSS provides an unbiased estimator of the population mean and that the sample mean under RSS is more precise than the sample mean under simple random sampling (SRS). Dell and Clutter [3] showed that, regardless of whether the ranking is flawless, the sample mean based on RSS is unbiased for the population mean and is at least as efficient as the sample mean based on SRS. Stokes [4] explained that easily available concomitant variables can be used to rank the variable of interest. Al-Omari et al. [5] introduced simple and generalized Z ranked set sampling schemes (ZRSS). They demonstrated that population mean estimators based on the proposed designs produce more efficient results for non-uniform distributions. Chen et al. [6] studied the maximum likelihood estimator (MLE) of the location parameter based on moving extremes ranked set sampling (MERSS). For further modified RSS schemes, see Al-Nasser et al. [7], Bani-Mustafa et al. [8], Samawi [9], Salehi and Ahmadi [10], Majd and Saba [11], Sevinc et al. [12], Khan et al. [13], Ali et al. [14] and Monjed et al. [15]. For different efficient classes of estimators under RSS and stratified ranked set sampling (StRSS), see Bhushan et al. [16–21].

Samawi [22] extended the ordinary RSS design to the StRSS scheme, in which ranked set samples are selected from each stratum. In an empirical investigation, he found that StRSS estimated the population mean better than traditional stratified simple random sampling (StSRS). Samawi and Saeid [23] presented the stratified extreme ranked set sampling (StERSS) design to estimate the population mean. In their scheme, the population is divided into H strata, and extreme ranked set samples are then selected from each stratum. Ibrahim et al. [24] presented the stratified median ranked set sampling (StMRSS) design, in which median ranked set sampling (MRSS) is used to select the samples from each stratum. They showed by simulation that StMRSS was more efficient than some of its counterpart designs. Al-Omari et al. [25] investigated the stratified percentile ranked set sampling (StPRSS) method. Their numerical study showed that the StPRSS-based mean estimator was more efficient than the mean estimators of some counterpart designs. Mahdizadeh and Zamanzade [26] proposed the stratified pair ranked set sampling (StPRSS) scheme and showed that it estimated the mean more efficiently and at lower cost than the StRSS method. Stratified unified ranked set sampling (StURSS) with perfect ranking was proposed by Chainarong et al. [27]. For populations containing outliers, Ali et al. [28] proposed stratified extreme-cum-median ranked set sampling (StEMRSS) to estimate the mean of heterogeneous populations and demonstrated that it performs well compared with other StRSS schemes. Viada and Allende [29] developed StRSS for the case of missing observations, in which the information needed to estimate the mean is completed by ratio-type imputation. They covered the necessary aspects of imputation and sample selection and used RSS models to compute the imputations for stratified populations.

In RSS, the units within each set are ranked visually, without actual measurement. When the data arrive in batches of different sizes, the ranking becomes difficult and error-prone, or it takes a long time. In this paper, a partial stratified ranked set sampling (PStRSS) design is proposed, which is very effective when not all experimental units are available at the same time. The suggested design selects the units using the StRSS and SRS methods, i.e., ’c’ units are selected using StRSS and ‘d’ units using SRS, so that a total sample of size n = c+d is selected. As a result, it is more effective than SRS and needs fewer sampling units and rankings than RSS. Section 2 reviews some existing ranked set sampling designs. In Section 3, we present the proposed design and compare it with other designs through simulation studies. Section 4 illustrates the proposed design using real data, and conclusions and final remarks are presented in Section 5.

2. Some existing ranked set sampling designs

2.1 Ranked set sampling

A ranked set sample of size n is chosen as follows. Select n² elements at random from the target population and divide them into n sets of n units each. The units within each set are then ranked visually or by any other low-cost technique. From the first set, the lowest-ranked unit is selected; from the second set, the second-lowest-ranked unit is selected; and the process continues until the highest-ranked unit is selected from the last set. The procedure can be repeated r times to obtain rn RSS units. Consider the study variable X, which has probability density function f_x(x), cumulative distribution function F_x(x), mean μ_x and variance σ_x². Let X_{i1j}, X_{i2j}, …, X_{inj} (i = 1,2,…,n) be the n independent simple random samples, each of size n, in the j^th cycle taken from f_x(x), where j = 1,2,…,r, and let X_{i(i:n)j} denote the i^th ranked unit of the i^th sample in the j^th cycle. The RSS mean is

$$\bar{X}_{RSS} = \frac{1}{rn}\sum_{j=1}^{r}\sum_{i=1}^{n} X_{i(i:n)j}. \tag{1}$$

The variance is as follows,

$$Var(\bar{X}_{RSS}) = \frac{\sigma_x^2}{rn} - \frac{1}{rn^2}\sum_{i=1}^{n}\left(\mu_{(i:n)} - \mu_x\right)^2, \tag{2}$$

where μ_(i:n) is the mean of the i^th order statistic and σ_x²/(rn) is the variance of the SRS mean based on rn units.
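The selection procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes perfect ranking (units are ordered by their true values), and the names `ranked_set_sample`, `rss_mean` and the `draw` callback are hypothetical helpers introduced only for this example.

```python
import random
import statistics


def ranked_set_sample(draw, n, r=1, rng=None):
    """Return the r*n measured units from r cycles of RSS with set size n.

    `draw(rng)` is a hypothetical callback that returns one unit's value.
    Ranking uses the true values, i.e. perfect ranking is assumed.
    """
    rng = rng or random.Random()
    measured = []
    for _ in range(r):          # r cycles
        for i in range(n):      # set number i supplies its (i+1)-th ranked unit
            ranked_set = sorted(draw(rng) for _ in range(n))
            measured.append(ranked_set[i])
    return measured


def rss_mean(measured):
    """Plain average of the measured RSS units."""
    return statistics.fmean(measured)


rng = random.Random(2023)
sample = ranked_set_sample(lambda g: g.gauss(0.0, 1.0), n=4, r=3, rng=rng)
print(len(sample), round(rss_mean(sample), 3))
```

Each cycle inspects n² units but measures only n of them, which is where the cost saving of RSS comes from.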

2.2 Stratified ranked set sampling

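The StRSS procedure divides the population into H mutually exclusive and exhaustive strata in order to obtain a sample of size n. Then, from stratum h, an independent ranked set sample of n_h units is selected in each of r_h cycles, giving r_h n_h units. Let X_{h(i:n_h)j} be the i^th judgment order statistic in the j^th cycle of the ranked set sample taken from stratum h. The observations are independent but not identically distributed. For fixed h and i, the X_{h(i:n_h)j}’s (j = 1,…,r_h) are identically distributed, and their common mean and variance are denoted by μ_(i:n)h and σ²_(i:n)h respectively. The estimator of the population mean under StRSS is given by,

$$\bar{X}_{StRSS} = \sum_{h=1}^{H} W_h \bar{X}_{RSS,h}, \tag{3}$$

where W_h = N_h/N is the weight of stratum h (N_h of the N population units) and

$$\bar{X}_{RSS,h} = \frac{1}{r_h n_h}\sum_{j=1}^{r_h}\sum_{i=1}^{n_h} X_{h(i:n_h)j}$$

is the mean estimator of RSS in stratum h. The variance is as follows,

$$Var(\bar{X}_{StRSS}) = \sum_{h=1}^{H} W_h^2\, Var(\bar{X}_{RSS,h}), \tag{4}$$

where

$$Var(\bar{X}_{RSS,h}) = \frac{1}{r_h n_h^2}\sum_{i=1}^{n_h} \sigma^2_{(i:n)h}. \tag{5}$$

Combining (4) and (5) we get,

$$Var(\bar{X}_{StRSS}) = \sum_{h=1}^{H} \frac{W_h^2}{r_h n_h^2}\sum_{i=1}^{n_h} \sigma^2_{(i:n)h} \tag{6}$$

$$= \sum_{h=1}^{H} \frac{W_h^2}{r_h n_h}\left[\sigma_h^2 - \frac{1}{n_h}\sum_{i=1}^{n_h}\left(\mu_{(i:n)h} - \mu_h\right)^2\right], \tag{7}$$

where μ_h and σ_h² are the mean and variance of stratum h.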
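As a rough illustration of the stratified estimator, the sketch below draws a ranked set sample in each stratum and combines the stratum means using weights proportional to stratum size, W_h = N_h/N. That weighting, the perfect-ranking assumption and the helper names `rss_from_stratum` and `strss_mean` are assumptions made for this example, not details taken from the paper.

```python
import random
import statistics


def rss_from_stratum(units, n_h, r_h, rng):
    """Measured units from r_h RSS cycles of set size n_h in one stratum."""
    measured = []
    for _ in range(r_h):
        # n_h^2 distinct units per cycle, split into n_h sets of n_h units
        drawn = rng.sample(units, n_h * n_h)
        for i in range(n_h):
            ranked_set = sorted(drawn[i * n_h:(i + 1) * n_h])  # perfect ranking
            measured.append(ranked_set[i])
    return measured


def strss_mean(strata, set_sizes, cycles, rng):
    """Weighted mean of per-stratum RSS means, with assumed W_h = N_h / N."""
    total = sum(len(units) for units in strata)
    estimate = 0.0
    for units, n_h, r_h in zip(strata, set_sizes, cycles):
        weight = len(units) / total
        stratum_mean = statistics.fmean(rss_from_stratum(units, n_h, r_h, rng))
        estimate += weight * stratum_mean
    return estimate


rng = random.Random(7)
stratum_1 = [rng.gauss(10.0, 2.0) for _ in range(200)]
stratum_2 = [rng.gauss(30.0, 5.0) for _ in range(600)]
print(strss_mean([stratum_1, stratum_2], set_sizes=[3, 3], cycles=[2, 2], rng=rng))
```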

3. Proposed design

In this section, the PStRSS design is proposed. The design is effective for some distributions when the units required by the StRSS method are not all available at once or arrive in batches. In that case, some units are selected by the StRSS method and the others by the SRS method. The procedure for obtaining a PStRSS sample of size n is as follows:

1. Define coefficient c such that c = βn, where 0≤β≤0.5.

2. Select 2c units from the population using the simple random sampling technique.

3. Select the remaining d = n−2c units as follows: divide the population into H mutually exclusive and exhaustive strata, then select an independent ranked set sample of d_h units from stratum h in the j^th cycle.

4. Repeat the above procedure r times to obtain rn units.
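One cycle of the steps above might look like the following sketch. It takes c directly (equivalently β = c/n) and assumes the caller supplies a per-stratum allocation `d_alloc` whose entries sum to d = n−2c; how d is split across strata, the perfect ranking, and the function name `pstrss_sample` are all assumptions for illustration.

```python
import random


def pstrss_sample(strata, n, c, d_alloc, rng):
    """One cycle of PStRSS: 2c units by SRS plus d = n - 2c units by StRSS.

    `d_alloc` is a hypothetical allocation of the d ranked units over strata.
    """
    if not 0 <= 2 * c <= n:
        raise ValueError("c must satisfy 0 <= c <= n/2, i.e. 0 <= beta <= 0.5")
    d = n - 2 * c
    if len(d_alloc) != len(strata) or sum(d_alloc) != d:
        raise ValueError("d_alloc needs one entry per stratum summing to n - 2c")

    # Step 2: simple random sample of 2c units from the whole population.
    population = [x for units in strata for x in units]
    srs_part = rng.sample(population, 2 * c)

    # Step 3: ranked set sample of d_h units within each stratum.
    # The SRS and ranked draws are independent, so a unit may appear in both.
    rss_parts = []
    for units, d_h in zip(strata, d_alloc):
        drawn = rng.sample(units, d_h * d_h)
        part = []
        for i in range(d_h):
            ranked_set = sorted(drawn[i * d_h:(i + 1) * d_h])  # perfect ranking
            part.append(ranked_set[i])
        rss_parts.append(part)
    return srs_part, rss_parts


strata = [list(range(0, 100)), list(range(100, 200))]
srs_part, rss_parts = pstrss_sample(strata, n=8, c=2, d_alloc=[2, 2], rng=random.Random(11))
print(srs_part, rss_parts)
```

Repeating the call r times gives the rn units mentioned in step 4.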

3.1 Estimation

The population mean estimator under PStRSS is given by Eq (8).

[Equation omitted. See PDF.]

The variance of the estimator is given by Eq (9).

[Equation omitted. See PDF.]

Lemma 1

The PStRSS mean estimator is an unbiased estimator of μ for symmetric distributions.

Proof.

Taking the expectation on both sides of Eq (8) gives Eqs (10) and (11).

[Equation omitted. See PDF.]

In a symmetric distribution, μ_(i:n,h) = μ (David and Nagaraja [30]).

3.2 Simulation

The performance of the mean estimator under PStRSS is compared with that of the mean estimator under StRSS in a simulation study. The efficiency of the PStRSS-based mean estimator is evaluated using both symmetric and non-symmetric distributions. The distributions considered are Normal (0,1), Lognormal (0,1), Weibull (0.5,1) and Logistic (0,1). The relative efficiencies (REs) of PStRSS and StRSS with respect to SRS are investigated using 50,000 iterations. The simulation is carried out in R 4.5. Samples are taken from two strata with sample sizes (4,4), (5,5) and (6,8).

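The REs for symmetric distributions are computed as

$$RE(\bar{X}_{SRS}, \bar{X}_t) = \frac{Var(\bar{X}_{SRS})}{Var(\bar{X}_t)},$$

where t denotes the sampling method, PStRSS or StRSS.

The REs for asymmetric distributions are computed as

$$RE(\bar{X}_{SRS}, \bar{X}_t) = \frac{MSE(\bar{X}_{SRS})}{MSE(\bar{X}_t)},$$

where t denotes the sampling method, PStRSS.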
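A Monte Carlo estimate of a relative efficiency can be sketched as follows. The sketch assumes the RE is the ratio of the SRS mean square error to that of the competing design, which equals the variance ratio when both estimators are unbiased. Plain RSS with perfect ranking under Normal (0,1) stands in for the design t, and the function names and the reduced number of replications are illustrative choices rather than the paper's settings.

```python
import random
import statistics


def srs_mean(n, rng):
    """Mean of a simple random sample of n Normal(0,1) units."""
    return statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(n)])


def rss_mean(n, rng):
    """Mean of one RSS cycle of set size n from Normal(0,1), perfect ranking."""
    picks = []
    for i in range(n):
        ranked_set = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        picks.append(ranked_set[i])
    return statistics.fmean(picks)


def relative_efficiency(estimator_t, n, true_mean, reps, rng):
    """Estimated MSE of the SRS mean divided by estimated MSE of estimator_t."""
    mse_srs = statistics.fmean(
        [(srs_mean(n, rng) - true_mean) ** 2 for _ in range(reps)]
    )
    mse_t = statistics.fmean(
        [(estimator_t(n, rng) - true_mean) ** 2 for _ in range(reps)]
    )
    return mse_srs / mse_t


re_rss = relative_efficiency(rss_mean, n=4, true_mean=0.0, reps=20000, rng=random.Random(1))
print(round(re_rss, 2))  # a value above one means the design beats SRS
```

Swapping `rss_mean` for a function that draws a sample under another design and returns its mean gives the corresponding RE against SRS.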

Fig 1 presents the REs. For the Normal (0,1) and Logistic (0,1) distributions, StRSS has higher REs than PStRSS (c = 1) and PStRSS (c = 2), but StRSS uses more units than PStRSS. Moreover, the mean estimator under PStRSS is more efficient than the SRS mean for these distributions, since its REs exceed one. Fig 1 further shows that for Weibull (0.5,1) and Lognormal (0,1), the RE of the mean estimator under PStRSS (c = 1) is higher than that of StRSS and SRS, whereas for PStRSS (c = 2) the RE exceeds that of StRSS only at larger sample sizes. For Lognormal (0,1), PStRSS (c = 1) outperforms StRSS, while the RE of PStRSS (c = 2) is lower than that of StRSS. Thus, Fig 1 shows that PStRSS with c = 1,2 is more efficient than SRS, and that the suggested design is more efficient than StRSS for some distributions, depending on the value of c and the sample size.

[Figure omitted. See PDF.]

3.3 Estimation of the median

For skewed distributions, the median is the recommended measure of location. For example, the distributions of income, production, and expenditure are skewed. In this section, the median estimator under the suggested PStRSS scheme is investigated through a simulation study. R 4.5 is used, and the simulation is repeated 50,000 times to estimate the mean square error of the median estimator under PStRSS for c = 1 and 2.

The RE of the median estimator for the h^th stratum is given as

[Equation omitted. See PDF.]

where t denotes the sampling method, PStRSS or StRSS.

The results of the simulation study are presented in Fig 2. For the Normal (0,1) distribution, StRSS performs better than PStRSS for c = 1 and 2. For the Lognormal (0,1) distribution, PStRSS (c = 1) estimates the median better than StRSS, while PStRSS (c = 2) is more precise than StRSS only for large sample sizes. Fig 2 further reveals that the median estimator under PStRSS for c = 1,2 is more efficient than that under StRSS for large sample sizes. For Weibull (0.5,1), the efficiency of the median estimator under PStRSS (c = 1) is higher than that of StRSS, while for PStRSS (c = 2) it is higher only for large samples. Fig 2 suggests that PStRSS with c = 1 and c = 2 estimates the population median more efficiently than the SRS-based median estimator.

[Figure omitted. See PDF.]

4. An application to COVID-19 confirmed cases in Pakistan

Coronavirus disease 2019 (COVID-19) is an infectious disease caused by a newly identified coronavirus. The daily confirmed COVID-19 cases in Pakistan for two months, October and November 2020 (COVID-19 Cases Data, 2020 [31]), are considered.

We are interested in selecting a sample of n = 4 days using the suggested PStRSS scheme, StRSS, and StSRS. Months are treated as strata, and equal allocation is used in the PStRSS, StRSS, and StSRS designs. In PStRSS, 2 sample units are selected by the StRSS method and 2 by the SRS method.

The mean and variance for the suggested PStRSS scheme, StRSS, and StSRS are presented in Table 1. Here n = 4 is the sample size and r = 1 is the number of cycles. In the suggested scheme, the PRSS samples are drawn from each stratum. To select a ranked set sample of size n = 4, the researcher must observe 16 units, which is hard to do with a limited budget and time. Under PStRSS, the researcher needs to observe only (4, 1) = 10 units and (4, 2) = 4 units from the two strata. Table 1 shows that the PStRSS-based mean estimator not only outperforms its competitors based on StRSS and StSRS but also needs fewer units than StRSS. The PStRSS mean is 1219.333, which is closer to the population mean than the StRSS and StSRS means. Thus, the suggested scheme estimates the confirmed cases of COVID-19 in Pakistan more efficiently.

[Figure omitted. See PDF.]

5. Conclusion and final remarks

This paper proposes a new efficient and cost-effective design, PStRSS. The design uses fewer sampling units than traditional StRSS and, for some distributions, provides a more efficient mean and median estimator of the population. Therefore, when the units required to conduct StRSS are not available, one should use PStRSS instead of switching to SRS: it is more efficient than SRS and, for some distributions, estimates the mean and median more precisely. PStRSS outperforms SRS for the Normal (0,1) and Logistic (0,1) distributions, and PStRSS (c = 1) is more efficient than both StRSS and SRS for Weibull (0.5,1) and Lognormal (0,1). It is concluded that PStRSS with c = 1,2 is more efficient than SRS and that, for the skewed distributions considered, PStRSS with c = 1 outperforms StRSS in terms of the mean estimator. The efficiency of the suggested approach also depends on the distribution, the sample size and the value of c. Moreover, the suggested scheme estimates the average number of confirmed COVID-19 cases efficiently. When the daily confirmed cases of some area cannot all be obtained, the PStRSS sampling method is recommended for estimating the daily confirmed cases of COVID-19.

Citation: M M, Almanjahie IM, Ismail M, Cheema AN (2023) Partial stratified ranked set sampling scheme for estimation of population mean and median. PLoS ONE 18(2): e0275340. https://doi.org/10.1371/journal.pone.0275340

About the Authors:

Maria M

Roles: Formal analysis, Investigation, Resources, Writing – original draft

Affiliation: Department of Statistics, COMSATS University, Islamabad, Lahore, Pakistan

Ibrahim M. Almanjahie

Roles: Funding acquisition, Investigation, Resources

Affiliations: Department of Mathematics, College of Science, King Khalid University, Abha, Saudi Arabia, Statistical Research and Studies Support Unit, King Khalid University, Abha, Saudi Arabia

ORCID: https://orcid.org/0000-0002-4651-3210

Muhammad Ismail

Roles: Supervision

Affiliation: Department of Statistics, COMSATS University, Islamabad, Lahore, Pakistan

ORCID: https://orcid.org/0000-0002-5591-9511

Ammara Nawaz Cheema

Roles: Conceptualization, Data curation, Formal analysis, Methodology, Software

E-mail: [email protected]

Affiliation: Department of Mathematics, Air University, Islamabad, Pakistan

ORCID: https://orcid.org/0000-0002-3616-6024

References

1. McIntyre GA. Method for unbiased selective sampling using ranked sets. Australian Journal of Agricultural Research. 1952; 3(4): 385–90.

2. Takahasi K, Wakimoto K. On unbiased estimates of the population mean based on the sample stratified by means of ordering. Annals of the Institute of Statistical Mathematics. 1968; 20(8): 1–31.

3. Dell TR, Clutter JL. Ranked set sampling theory with order statistics background. Biometrics. 1972; 28(2): 545–555.

4. Stokes SL. Ranked set sampling with concomitant variables. Communications in Statistics-Theory and Methods. 1977; 6(12): 1207–1211.

5. Al-Omari AI, Almanjahie IM. New improved ranked set sampling designs with an application to real data. Computers, Materials & Continua. 2021; 67(2): 1503–1522.

6. Chen WX, Long CX, Yang R, Yao DS. Maximum likelihood estimator of the location parameter under moving extremes ranked set sampling design. Acta Mathematica Applicata Sinica, English Series. 2021; 37(1): 101–108.

7. Al-Nasser AD, Bani-Mustafa A. Robust extreme ranked set sampling. Journal of Statistical Computation and Simulation. 2009; 79(7): 859–867.

8. Bani-Mustafa A, Al-Nasser AD, Aslam M. Folded ranked set sampling for asymmetric Distributions. Communications for Statistical Applications and Methods. 2011; 18(1): 147–53.

9. Samawi HM. Varied Set Size Ranked Set Sampling With Applications to Mean and Ratio Estimation. International Journal of Modelling and Simulation. 2011; 31(1): 6–13.

10. Salehi M, Ahmadi J. Record ranked set sampling scheme. Metron. 2014; 72(3): 351–365.

11. Majd MH, Saba AR. Robust extreme double ranked set sampling. Journal of Statistical Computation and Simulation. 2018; (9): 1749–1758.

12. Sevinc B, Cetintav B, Esemen M, Gurler S. RS Sampling: A Pioneering Package for Ranked Set Sampling. The R Journal. 2018; 88(4): 401–415.

13. Khan Z, Ismail M, Samawi HM. Mixture Ranked Set Sampling for Estimation of Population Mean and Median. Journal of Statistical Computation and Simulation. 2019; 90(4): 573–585.

14. Ali A, Butt MM, Iqbal K, Hanif M, Zubair M. Estimation of population mean by using a generalized family of estimators under classical ranked set sampling. RMS: Research in Mathematics & Statistics. 2021; 8 (1): 105–113.

15. Monjed H, Samuh M, Omar H, Hossain MP. Mixed double-ranked set sampling: A more efficient and practical approach. Revstat–Statistical Journal. 2021; 19(1): 145–160.

16. Bhushan S, Kumar A. Novel log type class of estimators under ranked set sampling. Sankhya B. 2021; 84(1): 1–27.

17. Bhushan S, Kumar A, Singh S. Some efficient classes of estimators under stratified sampling. Communications in Statistics—Theory and Methods. 2021; 1–30.

18. Bhushan S, Kumar A, Lone SA. On some novel classes of estimators under ranked set sampling. AEJ-Alexandria Engineering Journal. 2021; 61: 5465–5474.

19. Bhushan S, Kumar A. On optimal classes of estimators under ranked set sampling. Communications in Statistics-Theory and Methods. 2022; 51(8): 2610–2639.

20. Bhushan S, Kumar A. Predictive estimation approach using difference and ratio type estimators in ranked set sampling. Journal of Computational and Applied Mathematics. 2022; 410: 114214.

21. Bhushan S, Kumar A. An efficient class of estimators based on ranked set sampling. Life Cycle Reliability and Safety Engineering. 2022; 11: 39–48.

22. Samawi HM. Stratified ranked set sample. Pakistan Journal Of Statistics. 1996; 12(1): 9–16.

23. Samawi HM, Saeid LJ. Stratified extreme ranked set sample with application to ratio estimators. Journal of Modern Applied Statistical Methods. 2004; 3(1): 117–133.

24. Ibrahim K, Syam M, Al-Omari AI. Estimating the population mean using stratified median ranked set sampling. Applied Mathematical Sciences. 2010; 4(47): 2341–2354.

25. Al-Omari AI, Kamarulzaman I, Syam MI. Investigating the use of stratified percentile ranked set sampling method for estimating the population mean. Proyecciones (Antofagasta). 2011; 30(3): 351–368.

26. Mahdizadeh M, Zamanzade E. Stratified pair ranked set sampling. Communications in Statistics-Theory and Methods. 2018; 47(24): 5904–5915.

27. Chainarong P, Chanankarn S, Suwiwat W. Stratified Unified Ranked Set Sampling with Perfect Ranking. Burapha Science Journal. 2020; 25(3): 1026–1034.

28. Ali A, Butt MM, Azad MD, Ahmed Z, Hanif M. Stratified extreme-cum-median ranked set sampling. Pakistan Journal of Statistics. 2021; 37(3): 215–235.

29. Viada-Gonzalez CE, Allende-Alonso SM. Stratified ranked set sampling for estimating the population mean with ratio-type imputation of the missing values. Ranked Set Sampling Models and Methods. IGI Global Publisher. 2022; 141–171.

30. David HA, Nagaraja HN. Order Statistics. 3rd ed. Hoboken, New Jersey: John Wiley & Sons, Inc. 2003.

31. Novel Coronavirus (COVID-19) Cases Data. 2020. Accessed 1 November 2020. Retrieved from https://data.humdata.org/dataset/novel-coronavirus-2019-ncov-cases.

© 2023 M et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Ranked set sampling (RSS) is an alternative to simple random sampling that requires less money and time. RSS is modified here to obtain a more efficient and cost-effective estimator of population parameters. This paper aims to provide a design that is more efficient and cost-effective than stratified ranked set sampling and simple random sampling. In some distributions, the suggested method uses fewer sample units than stratified ranked set sampling and gives a more efficient estimation of population parameters. In symmetric distributions, the proposed design, called "partial stratified ranked set sampling", yields an unbiased estimator of the population mean. The design is illustrated with real data on confirmed COVID-19 cases.

Details

Title

Partial stratified ranked set sampling scheme for estimation of population mean and median

Author

Maria, M; Almanjahie, Ibrahim M; Ismail, Muhammad; Cheema, Ammara Nawaz

First page

e0275340

Section

Research Article

Publication year

2023

Publication date

Feb 2023

Publisher

Public Library of Science

e-ISSN

19326203

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1371/journal.pone.0275340

ProQuest document ID

2776986733
