Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus, Brazil

Unmitigated spread in Brazil
Despite an extensive network of primary care availability, Brazil has suffered profoundly during the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic. Using daily data from state health offices, Castro et al. analyzed the pattern of spread of COVID-19 cases and deaths in the country from February to October 2020. Clusters of deaths before cases became apparent indicated unmitigated spread. SARS-CoV-2 circulated undetected in Brazil for more than a month as it spread north from São Paulo. In Manaus, transmission reached unprecedented levels after a momentary respite in mid-2020. Faria et al. tracked the evolution of a new, more aggressive lineage called P.1, which has 17 mutations, including three (K417T, E484K, and N501Y) in the spike protein. After a period of accelerated evolution, this variant emerged in Brazil during November 2020. Coupled with the emergence of P.1, disease spread was accelerated by stark local inequalities and political upheaval, which compromised a prompt federal response. Science, abh1558 and abh2644, this issue p. 821 and p. 815

Brazil has experienced high mortality during the COVID-19 pandemic, recording >300,000 deaths and >13 million reported cases, as of March 2021. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and disease burden have been highly variable across the country, with the state of Amazonas in north Brazil being the worst-affected region (1). Serological surveillance of blood donors in Manaus, the capital city of Amazonas and the largest city in the Amazon region, has suggested >67% cumulative attack rates by October 2020 (2). Similar but slightly lower seroprevalences have also been reported for cities in neighboring regions (3,4). However, the level of previous infection in Manaus was clearly not sufficient to prevent a rapid resurgence in SARS-CoV-2 transmission and mortality there during late 2020 and early 2021 (5), which has placed substantial pressure on the city's health care system.
Here, we show that the second wave of infection in Manaus was associated with the emergence and rapid spread of a new SARS-CoV-2 lineage of concern, named lineage P.1. The lineage carries a distinctive constellation of mutations (table S1), including several that have been previously determined to be of virological importance (6-10) and that are located in the spike protein receptor binding domain (RBD), the region of the virus involved in recognition of the angiotensin-converting enzyme-2 (ACE2) cell surface receptor (11). Using genomic data, structure-based mapping of mutations of interest onto the spike protein, and dynamical epidemiology modeling of genomic and mortality data, we investigated the emergence of the P.1 lineage and explored epidemiological explanations for the resurgence of COVID-19 in Manaus.

Identification and nomenclature of the P.1 lineage in Manaus
In late 2020, two SARS-CoV-2 lineages of concern were discovered through genomic surveillance, both characterized by sets of notable mutations: lineage B.1.351, first reported in South Africa (12), and lineage B.1.1.7, detected in the UK (13). Both variants have transmitted rapidly in the countries where they were discovered and spread to other regions (14,15). Analyses indicate that B.1.1.7 has higher transmissibility and causes more severe illness as compared with those of previously circulating lineages in the UK (1,16,17).
Our early data indicated the presence of a novel SARS-CoV-2 lineage in Manaus that contained 17 amino acid changes (including 10 in the spike protein), three deletions, four synonymous mutations, and a four-base-pair nucleotide insertion compared with the most closely related available sequence (GISAID ID: EPI_ISL_722052) (Fig. 1B; lineage-defining mutations can be found in table S1) (27). This lineage was given a new designation, P.1, on the basis that it (i) is phylogenetically and genetically distinct from ancestral viruses, (ii) is associated with rapid spread in a new area, and (iii) carries a constellation of mutations that may have phenotypic relevance (26). Phylogenetic analysis indicated that P.1, and another lineage, P.2 (19), were descendants of lineage B.1.1.28, which was first detected in Brazil in early March 2020 (Fig. 1B). Our preliminary results were shared with local teams on 10 January 2021 and published online on 12 January 2021 (27). Concurrently, cases of SARS-CoV-2 P.1 infection were reported in Japan in travelers from Amazonas (28). As of 24 February 2021, P.1 had been confirmed in six Brazilian states, which in total received >92,000 air passengers from Manaus in November 2020 (Fig. 1C). Genomic surveillance first detected lineage P.1 on 6 December 2020 (Fig. 1A), after which the frequency of P.1 relative to other lineages increased rapidly in the tested samples from Manaus (Fig. 1D; lineage frequency information can be found in fig. S5). Retrospective genome sequencing might be able to recover earlier P.1 genomes. Between 2 November 2020 and 9 January 2021, we observed 7137 SARI cases and 3144 SARI deaths in Manaus (Fig. 1A). We generated a total of 182 SARS-CoV-2 sequences from Manaus during this period. This corresponds to one genome for every 39 SARI cases in Manaus, and this ratio is >100-fold higher than the average number of shared genomes per reported case during the same period in Brazil.
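The sampling ratio quoted above follows directly from the counts given in the text. The short Python check below uses only those reported figures; the variable names are illustrative.

```python
# Counts reported in the text for Manaus, 2 November 2020 to 9 January 2021.
sari_cases = 7137   # SARI cases observed
genomes = 182       # SARS-CoV-2 sequences generated in the same period

# Number of SARI cases represented by each sequenced genome.
cases_per_genome = sari_cases / genomes

print(f"SARI cases per genome: {cases_per_genome:.1f}")
```

Rounded to the nearest whole number, this gives the ratio of one genome per 39 SARI cases stated in the text.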

Dating the emergence of the P.1 lineage
We used molecular-clock phylogenetics to understand the emergence and evolution of lineage P.1 (25). We first regressed root-to-tip genetic distances against sequence sampling dates (29) for the P.1, P.2, and B.1.1.28 lineages separately (figs. S6 to S8). This exploratory analysis revealed similar evolutionary rates within each lineage but greater root-to-tip distances for P.1 compared with B.1.1.28 (fig. S8), suggesting that the emergence of P.1 was preceded by a period of faster molecular evolution. The B.1.1.7 lineage exhibits similar evolutionary characteristics (13), and its period of accelerated evolution was hypothesized to have occurred in a chronically infected or immunocompromised patient (30,31).
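The logic of the root-to-tip comparison can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study: the function name, the sampling dates, the rate, and the offset are all hypothetical, chosen only to show how two lineages can share a slope (evolutionary rate) while one sits at a consistently greater distance from the root.

```python
import numpy as np

def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (rate, t_root): the slope (substitutions/site/year) and the
    date at which the fitted line crosses zero distance.
    """
    dates = np.asarray(dates, dtype=float)
    distances = np.asarray(distances, dtype=float)
    t0 = dates.mean()  # center the dates for numerical stability
    rate, intercept = np.polyfit(dates - t0, distances, 1)
    t_root = t0 - intercept / rate
    return rate, t_root

# Hypothetical inputs (NOT data from this study): decimal-year sampling
# dates and noise-free distances for a background lineage, plus a second
# lineage with the same rate but a constant extra divergence.
dates = np.array([2020.90, 2020.95, 2021.00, 2021.05])
background = 8e-4 * (dates - 2019.95)
shifted = background + 3e-4

rate_bg, root_bg = root_to_tip_regression(dates, background)
rate_sh, root_sh = root_to_tip_regression(dates, shifted)

print(f"background: rate={rate_bg:.2e}, zero-distance date={root_bg:.3f}")
print(f"shifted:    rate={rate_sh:.2e}, zero-distance date={root_sh:.3f}")
```

In this toy setting the two slopes agree, but the shifted lineage extrapolates to an implausibly early zero-distance date under a single constant rate. That is the kind of pattern that motivates allowing a faster rate on the ancestral branch.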
To date the emergence of P.1, while accounting for a faster evolutionary rate along its ancestral branch, we used a local molecular clock model (32) with a flexible nonparametric demographic tree prior (33). Using this approach, we estimated the date of the common ancestor of the P.1 lineage to be around 15 November 2020 [median, 95% Bayesian credible interval (BCI), 6 October to 24 November 2020; mean, 9 November 2020] (fig. S9). This is only 3 to 4 weeks before the resurgence in SARS-CoV-2 confirmed cases in Manaus (Figs. 1A and 2 and fig. S9). The P.1 sequences formed a single well-supported group (posterior probability = 1.00) that clustered most closely with B.1.1.28 sequences from Manaus (Fig. 2, "AM"), suggesting that P.1 emerged there. The earliest P.1 samples were detected in Manaus (34). The first known travel-related cases were detected in Japan (28) and São Paulo (table S3) and were both linked to travel from Manaus. Furthermore, the local clock model statistically confirmed a higher evolutionary rate for the branch immediately ancestral to lineage P.1 compared with lineage B.1.1.28 as a whole [Bayes factor (BF) = 6.04].
Our data indicate multiple introductions of the P.1 lineage from Amazonas to Brazil's southeastern states (Fig. 2). We also detected seven small well-supported clusters of P.2 sequences from Amazonas (two to six sequences, posterior probability = 1.00). Virus exchange between Amazonas state and the urban metropolises in southeast Brazil largely follows patterns of national air travel mobility (Fig. 1D and fig. S10).

Infection with P.1 and sample viral loads
We analyzed all quantitative reverse transcription polymerase chain reaction (RT-PCR) SARS-CoV-2-positive results from a laboratory that has provided testing in Manaus since May 2020 (Fig. 1A and data file S1), with the aim of exploring trends in sample quantitative RT-PCR cycle threshold (Ct) values, which are inversely related to sample virus loads and transmissibility (35). By focusing on data from a single laboratory, we reduced instrument and process variation that can affect Ct measurements.
We analyzed a set of quantitative RT-PCR positive cases for which virus genome sequencing and lineage classification had been undertaken (n = 147 samples). Using a logistic function (Fig. 3A), we found that the fraction of samples classified as P.1 increased from 0 to 87% in around 7 weeks (table S4), quantifying the trend shown in Fig. 1C. We found a small but statistically significant association between P.1 infection and lower Ct values, for both the E gene (lognormal regression, P = 0.029, n = 128 samples, 65 of which were P.1) and N gene (P = 0.01, n = 129 samples, 65 of which were P.1), with Ct values lowered by 1.43 [0.17 to 2.60, 95% confidence interval (CI)] and 1.91 (0.49 to 3.23) cycles in the P.1 lineage on average, respectively (Fig. 3B).
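A logistic curve of the kind mentioned above can be sketched as follows. This is a simplified illustration rather than the fitting procedure used in the study: it fits a straight line to the logit of the lineage fraction, ignores binomial sampling noise, and uses hypothetical weekly fractions generated from assumed parameters (`rate` and `midpoint` are illustrative names and values).

```python
import numpy as np

def logistic(t, rate, midpoint):
    """Logistic curve: fraction of samples assigned to the new lineage at time t."""
    return 1.0 / (1.0 + np.exp(-rate * (t - midpoint)))

def fit_logistic_via_logit(t, frac):
    """Estimate (rate, midpoint) by linear regression on logit(frac).

    Requires 0 < frac < 1 for every observation; a weighted binomial fit
    would be preferable with real count data.
    """
    t = np.asarray(t, dtype=float)
    frac = np.asarray(frac, dtype=float)
    logit = np.log(frac / (1.0 - frac))
    rate, intercept = np.polyfit(t, logit, 1)
    return rate, -intercept / rate

# Hypothetical weekly fractions (NOT data from this study), generated from
# an assumed growth rate of 1.0 per week and a midpoint at week 4.
weeks = np.arange(1, 8)
frac = logistic(weeks, rate=1.0, midpoint=4.0)

rate_hat, midpoint_hat = fit_logistic_via_logit(weeks, frac)
print(f"estimated rate per week: {rate_hat:.3f}, midpoint week: {midpoint_hat:.3f}")
```

Because the synthetic fractions lie exactly on a logistic curve, the fit recovers the assumed parameters; real sequencing counts would scatter around the curve.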
Using a larger sample of 942 Ct values (including an additional 795 samples for which no lineage information was available), we investigated Ct values across three time periods characterized by increasing P.1 relative abundance. Average Ct values for both the E and N genes declined through time, as both case numbers and the fraction of P.1 infections increased, with Ct values significantly lower in period 3 as compared with period 1 (E gene, P = 0.12 and P < 0.001 for comparison of time periods 2 and 3 to period 1; N gene, P = 0.14 and P < 0.001, respectively) (Fig. 3C). Analyses of Ct values for samples from a different laboratory, also based in Manaus, showed similarly significant declines between the first and third time periods defined here (P < 0.0001 for both E and N genes) (fig. S11 and data file S3).
However, population-level Ct distributions are sensitive to changes in the average time since infection when samples are taken, so that median Ct values can decrease during epidemic growth periods and increase during epidemic decline (36). To account for this effect, we assessed the association between P.1 infection and Ct levels while controlling for the delay between symptom onset and sample collection. Statistical significance was lost for both data sets (E gene, P = 0.15 [...]). Because of this confounding factor, we cannot distinguish whether P.1 infection is associated with increased viral loads (37) or a longer duration of infection (38).
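To give a sense of scale for the Ct differences reported in this section, the Python sketch below converts a Ct shift into an approximate fold difference in template quantity. It rests on an assumption not made in the text: idealized amplification in which the product doubles every cycle (`efficiency = 1.0`). Real assays deviate from this, and the conversion says nothing about the confounding by time since infection discussed above.

```python
def fold_change_from_ct_shift(delta_ct, efficiency=1.0):
    """Approximate fold difference in starting template for a Ct decrease.

    delta_ct:   number of cycles by which Ct is lower.
    efficiency: assumed amplification efficiency per cycle
                (1.0 = perfect doubling; an idealization).
    """
    return (1.0 + efficiency) ** delta_ct

# Mean Ct reductions reported in the text for P.1 versus non-P.1 samples.
e_gene_fold = fold_change_from_ct_shift(1.43)
n_gene_fold = fold_change_from_ct_shift(1.91)

print(f"E gene: ~{e_gene_fold:.1f}-fold, N gene: ~{n_gene_fold:.1f}-fold")
```

Under this idealization, the reported mean shifts correspond to differences of less than one order of magnitude, consistent with the description of the association as small.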

Mathematical modeling of lineage P.1 epidemiological characteristics
We next explored epidemiological scenarios that might explain the recent resurgence of transmission in Manaus (39). To do this, we extended a semimechanistic Bayesian model of SARS-CoV-2 transmissibility and mortality (40-42) to include two categories of virus ("P.1" and "non-P.1") and to allow infection severity, transmissibility, and propensity for reinfection to vary between the categories. The model also integrates information on the timing of P.1 emergence in Manaus using our molecular clock results (Fig. 2). It explicitly incorporates waning of immune protection after infection, parameterized on the basis of dynamics observed in recent studies (16,43), to explore the competing hypothesis that waning of prior immunity might explain the observed resurgence (42). We used the model to evaluate the statistical support that P.1 possesses altered epidemiological characteristics compared with local non-P.1 lineages. Epidemiological model details and sensitivity analyses (tables S5 to S10) can be found in the supplementary materials. The model is fitted to both COVID-19 mortality data [with a correction for systematic reporting delays (44,45)] and the estimated increase through time in the proportion of infections due to P.1 derived from genomic data (table S4). We assumed that within-category immunity wanes over time (50% wane within a year, although sensitivity analyses varying the rapidity of waning are presented in table S7) and that cross-immunity (the degree to which previous infection with a virus belonging to one category protects against subsequent infection with the other) is symmetric between categories. Our results suggest that the epidemiological characteristics of P.1 are different from those of previously circulating local SARS-CoV-2 lineages, but the results also highlight substantial uncertainty in the extent and nature of this difference.
Plausible values of transmissibility and cross-immunity exist in a limited area but are correlated (Fig. 4A, with the extent of immune evasion defined as 1 minus the inferred cross-immunity). This is expected because in the model, a higher degree of cross-immunity means that greater transmissibility of P.1 is required to generate a second epidemic. Within this plausible region of parameter space, P.1 can be between 1.7 and 2.4 times more transmissible (50% BCI, 2.0 median, with a 99% posterior probability of being >1) than local non-P.1 lineages and can evade 21 to 46% (50% BCI, 32% median, with a 95% posterior probability of being able to evade at least 10%) of protective immunity elicited by previous infection with non-P.1 lineages, corresponding to 54 to 79% (50% BCI, 68% median) cross-immunity (Fig. 4A). The joint-posterior distribution is inconsistent with a combination of highly increased transmissibility and low cross-immunity and, conversely, also with near-complete cross-immunity but only a small increase in transmissibility (Fig. 4A). Moreover, our results further show that natural immunity waning alone is unlikely to explain the observed dynamics in Manaus, with support for P.1 possessing altered epidemiological characteristics robust to a range of values assumed for the date of the lineage's emergence and the rate of natural immunity waning (tables S5 and S7). We caution that these results are not generalizable to other settings; more detailed and direct data are needed to identify the exact degree and nature of the changes to the epidemiological characteristics of P.1 compared with previously circulating lineages.
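Because immune evasion is defined as 1 minus the inferred cross-immunity, the two sets of intervals quoted above are complements of each other. The short Python check below makes the correspondence explicit using only the percentages reported in the text; the function and dictionary names are illustrative.

```python
def cross_immunity_from_evasion(evasion_pct):
    """Cross-immunity (%) implied by a given immune evasion (%)."""
    return 100 - evasion_pct

# Immune evasion estimates reported in the text (50% BCI and median), in percent.
evasion = {"lower": 21, "median": 32, "upper": 46}

# The lower evasion bound maps to the upper cross-immunity bound, and vice versa.
cross_immunity = {k: cross_immunity_from_evasion(v) for k, v in evasion.items()}

print(cross_immunity)
```

The resulting values span 54 to 79% with a median of 68%, matching the cross-immunity interval stated in the text.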
We estimate that infections are 1.2 to 1.9 times more likely (50% BCI, median 1.5, 90% posterior probability of being >1) to result in mortality in the period after the emergence of P.1, compared with before, although posterior estimates of this relative risk are also correlated with inferred cross-immunity (Fig. 4B). More broadly, the recent epidemic in Manaus has strained the city's health care system, leading to inadequate access to medical care (46). We therefore cannot determine whether the estimated increase in relative mortality risk is due to P.1 infection, stresses on the Manaus health care system, or both.
Our model makes the assumption of a homogeneously mixed population and therefore ignores heterogeneities in contact patterns (differences in private versus public hospitals are provided in fig. S13). This is an important area for future research. The model fits observed time series data from Manaus on COVID-19 mortality (Fig. 4C) and the relative frequency of P.1 infections (Fig. 4D) and also captures previously estimated trends in cumulative seropositivity in the city (Fig. 4E). We estimate the reproduction number (R_t) on 7 February 2021 to be 0.1 (median, 50% BCI, 0.04 to 0.2) for non-P.1 and 0.5 (median, 50% BCI, 0.4 to 0.6) for P.1 (Fig. 4F).
Three key mutations present in P.1 (N501Y, K417T, and E484K) are in the spike protein RBD. The first two interact with human ACE2 (hACE2) (11), whereas E484K is located in a loop region outside the direct hACE2 interface (fig. S14). The same three residues are mutated in the B.1.351 variant of concern, and N501Y is also present in the B.1.1.7 lineage. The independent emergence of the same constellation of mutations in geographically distinct lineages indicates a process of convergent molecular adaptation. As in SARS-CoV-1 (48-50), mutations in the RBD may increase affinity of the virus for host ACE2 and consequently influence host cell entry and virus transmission.
Recent molecular analysis of B.1.351 (51) indicates that the three P.1 RBD mutations may similarly enhance hACE2 engagement, providing a plausible hypothesis for an increase in transmissibility of the P.1 lineage. Moreover, E484K is associated with reduced antibody neutralization (6,9,52,53). RBD-presented epitopes account for ~90% of the neutralizing activity of sera from individuals previously infected with SARS-CoV-2 (54); thus, tighter binding of P.1 viruses to hACE2 may further reduce the effectiveness of neutralizing antibodies.
Fig. 4 (caption, partially recovered). [...] (Fig. 3A), and green ribbons (dark = 50% BCI, light = 95% BCI) are the model fit to the data. (E) Estimated cumulative infection incidence for the P.1 and non-P.1 categories. Black data points with error bars are reversion-corrected estimates of seroprevalence from blood donors in Manaus (2). Colored ribbons are the model predictions of cumulative infection incidence for non-P.1 lineages (brown) and P.1 lineages (green). These points are shown for reference only and were not used to fit the model. (F) Bayesian posterior estimates of trends in reproduction number R_t for the P.1 and non-P.1 categories.

Conclusion
We show that P.1 most likely emerged in mid-November 2020 in Manaus, where high attack rates have been previously reported. High rates of mutation accumulation over short time periods have been reported in chronically infected or immunocompromised patients (13). Given a sustained generalized epidemic in Manaus, we believe that this is a potential scenario for P.1 emergence. Genomic surveillance and early data sharing by teams worldwide have led to the rapid detection and characterization of SARS-CoV-2 and new variants of concern (VOCs) (25), yet such surveillance is still limited in many settings. The P.1 lineage is spreading rapidly across Brazil (55), and this lineage has now been detected in >36 countries (56). However, existing virus genome sampling strategies are often inadequate for determining the true extent of VOCs in Brazil, and more detailed data are needed to address the impact of different epidemiological and evolutionary processes in their emergence. Sustainable genomic surveillance efforts to track variant frequency [for example, (57-59)] coupled with analytical tools to quantify lineage dynamics [for example, (60,61)] and anonymized epidemiological surveillance data (62,63) could enable enhanced real-time surveillance of VOCs worldwide. Studies to evaluate real-world vaccine efficacy in response to P.1 are urgently needed. Neutralization titers represent only one component of the elicited response to vaccines, and minimal reduction of neutralization titers relative to earlier circulating strains is not uncommon. Until equitable allocation of and access to effective vaccines are available to all, nonpharmaceutical interventions should continue to play an important role in reducing the emergence of new variants.