Serology for SARS-CoV-2: Apprehensions, opportunities, and the path forward

Defining acceptable performance characteristics for SARS-CoV-2 serological tests requires factoring in how the results will be used. Serological testing for SARS-CoV-2 has enormous potential to contribute to COVID-19 pandemic response efforts. However, the required performance characteristics of antibody tests will critically depend on the use case (individual level versus population level).

Making data-driven decisions on how to fight the COVID-19 pandemic without completely shutting down economies will require better tools to understand the extent of transmission. The current crisis presents an opportunity to rethink how health systems generate and use surveillance data and how to harness the power of serological tests and seroepidemiology. The world's health systems are rushing to develop and implement testing for clinical use, evaluations of social policy, and quantification of population-level risk, which has brought into sharp focus the challenges facing surveillance programs throughout the world. There is an urgent need to monitor variations in disease transmission across populations and geographies in near real time. Rapid detection of active cases and contact tracing, using direct tests for presence of the virus (acute-phase diagnosis), is the cornerstone of containment strategies. For later phases of pandemic control, when the key questions involve when, where, and how to lift confinement measures and relax social distancing constraints, serological testing to measure antibody responses to the virus becomes paramount to refine understanding of transmission intensity and population susceptibility.
Antibody tests to detect exposure to SARS-CoV-2, the virus responsible for COVID-19, are rapidly becoming available (a list is maintained by FIND at www.finddx.org/covid-19/pipeline/), with the majority configured for detection of immunoglobulin G (IgG) antibodies to the Spike (S) protein of the virus, although other isotypes and antigens are being explored. Testing platforms under development include classical solid-phase immunoassays [mostly enzyme-linked immunosorbent assay (ELISA) formats, ranging from manual or semi-automated to high-flow automatons capable of handling several thousands of samples per day], methods based on bead-based flow cytometry and chemiluminescence (capable of high throughput and multiplexing), and lateral flow immunochromatographic assays [which have attracted the most attention because of potential point-of-care (POC) usage and suitability for home self-testing]. Assessing performance characteristics of these new tests is extremely important and challenging, raising issues regarding thresholds for sensitivity/specificity, potential cross-reactivity with other coronaviruses (particularly other subgroup B coronaviruses), the use of neutralization assays as a gold standard reference, difficulties harmonizing results reporting across different platforms, concerns for quality control in manufacturing, and, most importantly, a lack of baseline data required for test interpretation (1). The performance of different test platforms is likely to vary considerably. For instance, POC lateral flow assays are likely to be fraught with more problems of sensitivity/specificity than ELISA formats; however, their low cost and ease of use will facilitate more rapid scale-up and widespread adoption.
Despite enormous and ongoing efforts to study immune responses to COVID-19 in different clinical settings, to date, there are insufficient data and a poor understanding of the magnitude and duration of antibody responses (IgM, IgG, and IgA) after asymptomatic, mild, and severe infections. We do not yet understand how antibody responses vary across diverse populations with different genetic backgrounds, comorbidities, or infection histories. In this article, we discuss the use case for individual-level versus population-level serological testing, with a focus on IgG testing applications. We emphasize the dangers of using current serologic tests for individual-level risk assessments but highlight the potential power of deploying population-level serological testing (i.e., serosurveillance or seroepidemiology), even with assays of moderate sensitivity/specificity.

USE CASES FOR SARS-COV-2 SEROLOGY
At the individual level, serologic tests are frequently used to support clinical diagnosis by determining recent or prior infection [to supplement polymerase chain reaction (PCR) detection] or to determine vaccination status and requirements for boosting. In vaccine trials, individual assessments of antibody end points may be used to determine serostatus before enrollment as a tool to reduce bias, simplify analyses, and minimize required sample sizes (2). Specific to SARS-CoV-2, a widely discussed idea in the media has been the issuance of "immune passports," the proposed use of serology to infer immunity and thus enable a person to work on the front lines or return to daily work routines. Such an application must be predicated on an established surrogate of protection, a given antibody end point associated with clinical protection from infection, and a test with sufficient specificity to ensure people are not unintentionally put in harm's way (3). Serology tests with relatively high but imperfect specificity may lead to substantial false-positive results when used in low-incidence settings (Fig. 1A). For example, in a setting where 5% of the population has been infected, a test with 96% specificity and 90% sensitivity would yield a positive predictive value of just 54%; that is, only 54% of positive results would indicate a true infection. In addition to the risks of false positives, false negatives may occur in some previously infected persons who fail to produce antibodies specific to the antigens/epitopes in a given assay, or whose antibodies have already waned, or, when the test is used during an ongoing epidemic, among those who have not yet mounted a specific antibody response (4,5). For these individuals who do not mount a measurable antibody response despite having been infected, obtaining permission to return to work could be onerous.
A further consideration that undermines the individual use case is that, even with a true-positive antibody result, we do not know how well that result translates to protection or immunity, nor whether those who test positive for antibodies could still shed virus and infect others.
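The 54% figure quoted above follows directly from Bayes' rule. As a minimal sketch (the function name `positive_predictive_value` and the extra prevalence values in the loop are illustrative assumptions, not part of the original analysis), the calculation can be reproduced as follows:

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Fraction of positive test results that reflect a true prior infection."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    total_positives = true_positives + false_positives
    if total_positives == 0:
        raise ValueError("no positive results are expected with these inputs")
    return true_positives / total_positives


# Scenario from the text: 5% infected, 90% sensitivity, 96% specificity.
ppv = positive_predictive_value(0.05, 0.90, 0.96)
print(f"PPV at 5% prevalence: {ppv:.0%}")

# Illustrative only: the same assay at other (hypothetical) prevalence levels.
for prev in (0.01, 0.05, 0.20, 0.50):
    value = positive_predictive_value(prev, 0.90, 0.96)
    print(f"prevalence {prev:.0%}: PPV {value:.0%}")
```

With test performance held fixed, the predictive value of a positive result rises steeply with prevalence, which is why the same assay can be inadequate for individual decisions in a low-incidence setting.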
At the population level, representative cross-sectional serosurveys can provide aggregate "snapshots" of infection history and immunity of a population. The proportion of the population infected by SARS-CoV-2 cannot be assessed on the basis of PCR-confirmed cases alone because of variations in testing practices, timing of sampling, and the clinical spectrum of disease (e.g., asymptomatic infections). In contrast to case data, seroepidemiological datasets provide a less biased picture of risk of death (infection fatality rate) and the amplitude of transmission in different populations and can highlight disparities in infection rates without typical health-seeking behavior biases. Understanding the age-specific or spatial distribution of susceptibility could guide policymakers on where to intervene and to what degree, by helping to answer questions such as: What IgG seroprevalence in children is acceptable to allow schools to open? Do infection attack rates differ between children and adults? Population-level surveys could also help estimate the probability and timing of future waves of disease, which will critically depend upon duration of immunity (6), measure the impact of interventions such as physical distancing or vaccination, and, in later stages, confirm the absence of transmission.
Here, we underscore key differences between individual- and population-level use cases and emphasize that different use cases will require tests with different performance characteristics. While assays that "certify" an individual's immunity need to be correlated with protection and have near-perfect specificity (to limit the number of false positives when seroprevalence is low), assays to ascertain population-level exposure would have utility as long as the sensitivity and specificity are well defined for the target population, allowing for adjustment of seroprevalence estimates (Fig. 1B). Optimal thresholds for sensitivity/specificity can be "tuned" depending on local prevalence and intended use of the assay. For example, when conducting a serosurvey in low-prevalence settings, to achieve better precision of point estimates of disease burden, the assay specificity should be prioritized, typically at the cost of sensitivity. This can be achieved by raising the cutoff value for the assay used, for example, by setting higher optical density readings as the threshold for positivity in an ELISA. Similarly, in a high-prevalence setting, test sensitivity should be prioritized at the cost of specificity. Thus, we recommend the consideration of multiple threshold values (cutoffs) for assays that can be flexibly used in different contexts.
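The trade-off described above, in which raising the positivity cutoff buys specificity at the cost of sensitivity, can be illustrated with a small sketch. The optical density (OD) readings below are invented purely for illustration and do not come from any real assay; the names `sens_spec_at_cutoff`, `PREPANDEMIC_OD`, and `CONFIRMED_OD` are likewise hypothetical.

```python
from typing import Sequence, Tuple


def sens_spec_at_cutoff(negatives: Sequence[float],
                        positives: Sequence[float],
                        cutoff: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when readings >= cutoff are called positive."""
    sensitivity = sum(v >= cutoff for v in positives) / len(positives)
    specificity = sum(v < cutoff for v in negatives) / len(negatives)
    return sensitivity, specificity


# Made-up ELISA OD readings (assumption, for illustration only).
# Known negatives, e.g. prepandemic sera:
PREPANDEMIC_OD = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.22, 0.30, 0.35, 0.45]
# Known positives, e.g. sera from confirmed infections:
CONFIRMED_OD = [0.25, 0.40, 0.55, 0.70, 0.85, 1.00, 1.20, 1.50, 1.80, 2.10]

# Evaluate several candidate cutoffs rather than committing to a single one.
for cutoff in (0.20, 0.33, 0.50):
    sens, spec = sens_spec_at_cutoff(PREPANDEMIC_OD, CONFIRMED_OD, cutoff)
    print(f"cutoff {cutoff:.2f}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this toy dataset the lowest cutoff detects every confirmed infection but misclassifies several prepandemic samples, whereas the highest cutoff removes all false positives and misses some true infections. Reporting performance at more than one cutoff lets the same assay be matched to a low- or high-prevalence setting.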

FOUNDATIONAL STUDIES TO ENHANCE THE UTILITY AND INTERPRETATION OF SARS-COV-2 SEROLOGY
While many SARS-CoV-2 serological assays may have insufficient performance characteristics (sensitivity/specificity) to warrant use at the individual level and the World Health Organization currently recommends restricting antibody testing to research use only, we argue that these imperfect tests may nevertheless provide highly valuable tools to address critical public health questions, such as the safety of relaxing stay-at-home orders or school closures or evaluations of alternative intervention measures. To fully realize the benefits of population-level seroepidemiological studies, a number of fundamental questions must be addressed, relating to test performance, the dynamics of antibody responses in relation to infection, and the link between antibody responses and immunity (Table 1). Answering these questions across different populations and epidemiologic contexts will require various study designs, which we view as key for optimal interpretation of the growing number of

[Fig. 1B. Axes: measured seroprevalence (%) and adjusted seroprevalence estimate (%). Legend entries mark the likely range of current true seroprevalence, the regions where the adjusted estimate is bounded at 0 or 1, and the line where adjusted seroprevalence equals measured seroprevalence.]
The adjusted seroprevalence estimate is obtained after correcting the measured seroprevalence for the imperfect sensitivity and specificity of the assay. Estimated seroprevalence is therefore a function of test performance and is defined as (proportion of positive tests + specificity − 1)/(sensitivity + specificity − 1) (9, 10). The source code for the figure is posted at https://github.com/HopkinsIDD/covid-science-immuno.
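The correction stated in the figure legend above, often called the Rogan-Gladen estimator, is short enough to sketch directly. This is an illustrative implementation under stated assumptions, not the code from the authors' repository: the function name `adjusted_seroprevalence` is hypothetical, and bounding the estimate to the interval from 0 to 1 is an assumption based on the legend entries.

```python
def adjusted_seroprevalence(measured: float,
                            sensitivity: float,
                            specificity: float) -> float:
    """Correct a measured seroprevalence for imperfect test performance.

    Implements (measured + specificity - 1) / (sensitivity + specificity - 1),
    with the result bounded to [0, 1] (assumed handling of out-of-range values).
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        # A test no better than chance cannot be corrected this way.
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (measured + specificity - 1.0) / denominator
    return min(1.0, max(0.0, estimate))


# With 90% sensitivity and 96% specificity, a true prevalence of 5% yields
# 0.05 * 0.90 + 0.95 * 0.04 = 8.3% measured positives; the correction
# recovers the underlying 5%.
print(f"{adjusted_seroprevalence(0.083, 0.90, 0.96):.3f}")

# A measured value below the false-positive rate (1 - specificity) is bounded at 0.
print(f"{adjusted_seroprevalence(0.02, 0.90, 0.96):.3f}")
```

The adjustment is only as good as the sensitivity and specificity supplied to it, which is why these need to be well defined for the target population.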

GOVERNANCE
Serosurveillance for SARS-CoV-2 will only be capable of contributing to actionable public health information if serology measurements flow into efficient data pipelines. Scale-up of serological testing for pandemic response must therefore be accompanied by a governance model at the subnational, national, and international levels and by an operational research agenda that evaluates the utility of assays within specific contexts. With the plethora of new tests in development and diverse testing strategies, there is an urgent need for national-level strategies to enable pooling of results generated from different methods and sources. National-level governance will be required to provide oversight for sample collection and processing and linkage to personal data and to coordinate results analysis to the appropriate scale for policy relevance. Much like a national census is translated into infrastructure appropriations, serosurveillance could be used for resource allocations (and future vaccination efforts) to target transmission hotspots.
Data from carefully designed serostudies are urgently needed before widespread adoption or implementation of antibody testing programs. To ensure comparison across studies, there is a need for harmonization of assay protocols, sharing of reference standards, and a set of best practices for reporting results. Because seroepidemiological studies will require measurement of healthy individuals, various strategies for opportunistic sampling of individuals in community settings should be explored, as described in a proposed Global Serum Bank (7). A host of ethical and privacy issues will need to be addressed; we suggest that serosurveillance platforms should incorporate broad consent, enabling future screening of serum collections for multiple biomarkers of public health concern beyond SARS-CoV-2 alone. The SARS-CoV-2 pandemic has highlighted the value of transparency in disease surveillance for all nations. We see a role for international coordination of national seroepidemiology programs to facilitate standardizing methods and dissemination of results among national public health laboratories.
In summary, seroepidemiological studies and integrated serosurveillance platforms are urgently needed to guide and tailor SARS-CoV-2 response efforts and will continue to be critical for mitigating postpandemic resurgence. Coordinated serosurveillance provides opportunities to combine control efforts for different diseases into one coordinated program; this may be particularly valuable to assess the impact of the COVID-19 crisis on routine immunization programs (8). Platforms should be designed with a longer-term vision beyond COVID-19, to generate capacity for "precision public health" to monitor additional major diseases, and provide insights into how disease occurrence is interrelated with other health risk factors. Last, we stress that investing now in a fundamental and operational research agenda will allow us to rapidly develop serosurveillance as a powerful tool for population-level public health; however, using serological assays within low-prevalence settings to inform individual-based risk assessments, e.g., decisions regarding return to work, is complex and dangerously premature.

Table 1.
To statistically adjust for imperfect test performance in serosurveys (Fig. 1B).
Prepandemic samples representing diverse background exposures to other coronaviruses.
To inform choice of assay for use in specific populations and settings.

Longitudinal studies of infected individuals
To describe postinfection antibody decay dynamics.
Confirmed SARS-CoV-2 infections across a spectrum of clinical severity.
Combine antibody decay dynamics with cross-sectional surveys to estimate time since infection and reconstruct historical transmission trends and burden estimates.

Cohort studies of high-risk individuals
To determine which antibody responses (and magnitude) are best correlated with protection.
Uninfected close contacts (e.g., household members) of confirmed SARS-CoV-2-infected individuals, followed to identify secondary infections (through regular symptom screening and virus testing).
Identify correlates of protection to inform population-level assessments of immunity measured in cross-sectional surveys through comparing preexisting (at time of enrollment) levels of antibodies between those who become infected and those who do not.

Vaccine efficacy trials
To determine which antibody responses (and magnitude) are best correlated with protection.
General population or subgroups at highest priority to benefit from a vaccine with regular follow-up for symptomatic disease.
Identify correlates of protection to inform population-level assessments of immunity measured in cross-sectional surveys through comparing levels of antibodies between those who are protected and those who are not.