# Seroprevalence of SARS-CoV-2 antibody among urban Iranian population: findings from the second large population-based cross-sectional study | BMC Public Health

### Study design and participants

This population-based cross-sectional study was conducted in 16 cities across 15 provinces in Iran: Ardabil, Babol, Gorgan, Sari, Tabriz, and Urmia in the northern provinces; Hamedan, Kermanshah, Mashhad, Qom, Tehran, and Sanandaj in the central provinces; and Ahvaz, Kerman, Shiraz, and Zahedan in the southern provinces (Fig. 1). The detailed sampling method was described in the first study phase [3]. In brief, we randomly sampled members of the general population registered in the Iranian electronic health record system (SIB) based on their national identification numbers and invited them by telephone to visit a healthcare center for data collection. The SIB network belongs to a prospective population-based cohort study in which the demographic information and administrative health data of > 88% of Iranians (about 72 million people) are registered [6]. We included individuals aged ≥10 years and excluded those who were inaccessible or unwilling to participate. Unlike our previous serosurvey, the present study did not enroll high-risk occupational groups (such as healthcare workers). We considered provincial capitals as clusters because of the heterogeneous pattern of COVID-19 dispersion across the provinces of Iran, as well as factors that could affect COVID-19 prevalence, such as population density, the high correlation of humidity in each province with COVID-19 prevalence, and intra-city and intra-provincial movements [7, 8].

### Sample size calculation

The sample size was calculated based on an estimated COVID-19 prevalence of 14.2% [9], a relative estimation error of 10%, a precision of 5%, a non-response rate of 10%, and a design effect (Deff) of 1.75 to adjust for the clustered nature of the sampling, using the following formula:

Deff = 1 + d(n − 1), where the intraclass correlation coefficient (d) was 0.05 and the number of clusters (n = 16) was the total number of cities. Based on these inputs, the total sample size for this study was 9010 individuals. The sample size formula was:

$$\mathrm{n}=\frac{\left({\mathrm{z}}_{1-\upalpha \left/ 2\right.}^2\right)\ast \mathrm{p}\ast \left(1-\mathrm{p}\right)}{{\mathrm{d}}^2}$$
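As a minimal sketch of the two formulas above, the Python snippet below evaluates the design effect and the simple-random-sample size. The function names are hypothetical, and reading the "5% precision" as an absolute margin d = 0.05 with z = 1.96 is an assumption; the snippet does not attempt to reproduce the reported total of 9010, because the text does not spell out how the individual inputs were combined.

```python
import math


def design_effect(icc: float, n_clusters: int) -> float:
    """Deff = 1 + d(n - 1), with n taken as the number of cities, as in the text."""
    return 1 + icc * (n_clusters - 1)


def srs_sample_size(p: float, d: float, z: float = 1.96) -> int:
    """n = z^2 * p * (1 - p) / d^2, rounded up to a whole participant."""
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


# Values quoted in the text: intraclass correlation 0.05, 16 cities.
deff = design_effect(icc=0.05, n_clusters=16)

# Assumption: d = 0.05 is an absolute precision and z = 1.96 (95% confidence).
n_srs = srs_sample_size(p=0.142, d=0.05)

print(f"Deff = {deff:.2f}")
print(f"Simple-random-sample size = {n_srs}")
```

With these inputs the design effect evaluates to the 1.75 quoted in the text.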

### Procedures

After arriving at a collaborating center, the participants were interviewed by trained research staff to complete questionnaires covering demographic details, past medical history, COVID-19-related symptoms, and COVID-19-related exposures. A skilled laboratory technician then collected a 6 ml sample of venous blood from each participant into an EDTA-coated microtainer labeled with a unique participant identity number. Centrifuged plasma samples were then transported to a central laboratory on dry ice (−20 °C). Serum samples were assessed for the presence of SARS-CoV-2 nucleocapsid protein IgG and IgM antibodies, using Iran’s Food and Drug Administration-approved SARS-CoV-2 ELISA kits (Pishtaz Teb, Tehran, Iran) as per the manufacturer’s protocol [10]. The kits are based on an indirect method in which SARS-CoV-2-specific nucleocapsid antigen is coated onto 96-well plates. The recombinant SARS-CoV-2 nucleocapsid protein, expressed in Baculovirus-insect cells, comprises amino acids 1-419 and has a predicted molecular mass of 47.08 kDa. Details of the sample collection and ELISA kits have been presented previously [3].

### Test validation

Because the ELISA kits used in the present study were similar to those used in our previous serosurvey, their diagnostic performance and test validation were the same as previously described [3]. As before, we used two scenarios to adjust the seroprevalence rates. Scenario 1 (our own test validation data, with a sensitivity of 66.9% and a specificity of 98.2%) served as the primary test characteristic. Scenario 2 (the manufacturer’s data combined with our own test validation data, with a sensitivity of 71.8% and a specificity of 98.2%) was used for comparison with the scenario 1 test-adjusted estimates.

### Covariates

Demographic information included sex, age, and city of residence. Past medical history included the following self-reported comorbidities: heart disease, hypertension, chronic lung disease, asthma, diabetes, obesity, and renal disease. COVID-19-related symptoms included cough, fever, chills, sore throat, headache, dyspnea, diarrhea, anosmia, conjunctivitis, weakness, myalgia, arthralgia, altered level of consciousness, and chest pain experienced over the past 12 weeks [3, 11]. Participants were then categorized as asymptomatic (no symptoms), paucisymptomatic (one to three symptoms), or symptomatic (four or more symptoms). We also asked participants about their recent contact (over the past 12 weeks) with a confirmed COVID-19 patient.
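The symptom categories can be expressed as a simple threshold rule on the number of reported symptoms. The sketch below is illustrative only; the function name `symptom_category` is hypothetical, and treating "asymptomatic" as exactly zero reported symptoms is an assumption implied by the other two thresholds.

```python
def symptom_category(n_symptoms: int) -> str:
    """Map a count of reported symptoms (past 12 weeks) to a category."""
    if n_symptoms < 0:
        raise ValueError("symptom count cannot be negative")
    if n_symptoms == 0:
        return "asymptomatic"       # assumption: no reported symptoms
    if n_symptoms <= 3:
        return "paucisymptomatic"   # one to three symptoms
    return "symptomatic"            # four or more symptoms


# Hypothetical participants reporting 0, 2, and 5 symptoms.
for count in (0, 2, 5):
    print(count, symptom_category(count))
```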

### Statistical analysis

The statistical analyses were previously explained in detail [3]. Briefly, the overall crude seroprevalence of SARS-CoV-2-specific antibodies was estimated as the proportion of positive tests in the total sample. Age-sex-city population-weighted rates were computed within bootstrap samples using the 2016 population and household census in Iran as the standard population. Given the nature of participant selection, the bootstrap-weighted seroprevalence rate was computed for each combination of city (Ahvaz, Ardabil, Babol, Gorgan, Hamedan, Kerman, Kermanshah, Mashhad, Qom, Sanandaj, Sari, Shiraz, Tabriz, Tehran, Urmia, and Zahedan), age group (10-19, 20-29, 30-39, 40-49, 50-59, ≥60 years), and sex (male, female). Finally, to minimize bias due to the imperfect sensitivity and specificity of the antibody tests, we calculated the test-performance-adjusted weighted seroprevalence (bootstrap weight) for scenarios 1 and 2 using the formula proposed by Cassaniti et al. [12, 13], where AP denotes adjusted prevalence, UP denotes unadjusted (apparent) prevalence, Sp denotes test specificity, and Se denotes test sensitivity:

$$\mathrm{AP}=\frac{\mathrm{UP}+\mathrm{Sp}-1}{\mathrm{Se}+\mathrm{Sp}-1}$$
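To illustrate how this adjustment behaves, the sketch below applies the formula under the two test-performance scenarios described in the Test validation section. The apparent prevalence of 10% is a hypothetical input chosen only for illustration, not a study result, and the function name `adjusted_prevalence` is likewise an assumption.

```python
def adjusted_prevalence(up: float, se: float, sp: float) -> float:
    """AP = (UP + Sp - 1) / (Se + Sp - 1)."""
    denominator = se + sp - 1
    if denominator <= 0:
        raise ValueError("Se + Sp must exceed 1 for the adjustment to be meaningful")
    return (up + sp - 1) / denominator


# Sensitivity and specificity as quoted in the text for each scenario.
SCENARIOS = {
    "scenario 1": {"se": 0.669, "sp": 0.982},
    "scenario 2": {"se": 0.718, "sp": 0.982},
}

hypothetical_up = 0.10  # illustrative apparent prevalence, NOT a study result

adjusted = {
    name: adjusted_prevalence(hypothetical_up, **perf)
    for name, perf in SCENARIOS.items()
}
for name, ap in adjusted.items():
    print(f"{name}: adjusted prevalence = {ap:.4f}")
```

The function returns the raw value of the formula, so an apparent prevalence below 1 − Sp yields a negative estimate; how such cases were handled is not stated in the text.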

The 95% confidence intervals (CIs) for unweighted seroprevalence were estimated using the exact binomial method, and a bootstrap method was used to construct the 95% CIs for weighted and adjusted estimates [14, 15]. Categorical variables were reported as frequencies and percentages. We calculated the total number of infections by multiplying the infection prevalence by the total population of each province. We also assessed the distribution of SARS-CoV-2 seropositivity according to sex, age, comorbidity, contact with COVID-19 patients, and symptoms, using the chi-squared test. All statistical analyses were performed using Microsoft Excel and Stata version 14 (StataCorp, College Station, TX).
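The two interval types can be sketched in plain Python as follows. This is not the authors' Stata code: the exact binomial interval is computed here as a Clopper-Pearson interval by bisection, the bootstrap is a plain percentile bootstrap without the age-sex-city weighting, and the data (12 positives among 100 participants) are hypothetical. All function names are assumptions.

```python
import math
import random


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), summed in log space for stability."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return total


def _bisect(func, lo: float = 0.0, hi: float = 1.0, iters: int = 100) -> float:
    """Root of a function that is increasing on (lo, hi)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for k positives out of n tests."""
    # Lower limit solves P(X >= k | p) = alpha / 2; it is 0 when k == 0.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper limit solves P(X <= k | p) = alpha / 2; it is 1 when k == n.
    upper = 1.0 if k == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


def bootstrap_ci(results, n_boot: int = 2000, alpha: float = 0.05, seed: int = 0):
    """Percentile bootstrap CI for the proportion of positives (unweighted)."""
    rng = random.Random(seed)
    n = len(results)
    estimates = sorted(sum(rng.choices(results, k=n)) / n for _ in range(n_boot))
    return (estimates[int(alpha / 2 * n_boot)],
            estimates[int((1 - alpha / 2) * n_boot) - 1])


# Hypothetical data: 12 seropositive results among 100 participants.
results = [1] * 12 + [0] * 88
exact_lo, exact_hi = exact_binomial_ci(sum(results), len(results))
boot_lo, boot_hi = bootstrap_ci(results)
print(f"exact 95% CI:     {exact_lo:.3f} to {exact_hi:.3f}")
print(f"bootstrap 95% CI: {boot_lo:.3f} to {boot_hi:.3f}")
```

For samples of the size used in a national serosurvey, a dedicated statistics package would be the more practical choice; the bisection here is kept only to make the definition of the exact interval visible.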