Free fatty acid binding pocket in the locked structure of SARS-CoV-2 spike protein


COVID-19, caused by severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2), represents a global crisis. Key to SARS-CoV-2 therapeutic development is unraveling the mechanisms driving high infectivity, broad tissue tropism and severe pathology. Our 2.85 Å cryo-EM structure of SARS-CoV-2 spike (S) glycoprotein reveals that the receptor binding domains (RBDs) tightly bind the essential free fatty acid (FFA) linoleic acid (LA) in three composite binding pockets. The pocket also appears to be present in the highly pathogenic coronaviruses SARS-CoV and MERS-CoV. LA binding stabilizes a locked S conformation giving rise to reduced ACE2 interaction in vitro. In human cells, LA supplementation synergizes with the COVID-19 drug remdesivir, suppressing SARS-CoV-2 replication. Our structure directly links LA and S, setting the stage for intervention strategies targeting LA binding by SARS-CoV-2.

Seven coronaviruses are known to infect humans. The four endemic human coronaviruses OC43, 229E, HKU1 and NL63 cause mild upper respiratory tract infections, whereas the pandemic virus SARS-CoV-2 and the earlier SARS-CoV and MERS-CoV can cause severe pneumonia with acute respiratory distress syndrome, multi-organ failure, and death.

SARS-CoV-2 has acquired functions that promote its harsh disease phenotype. SARS-CoV-2 causes severe inflammation and damage to endothelial cells in the heart, kidneys, liver and intestines, suggestive of a vascular infection rather than a purely respiratory disease. The attachment of SARS-CoV-2 to a host cell is initiated by the spike protein trimer (S), which decorates the outer surface of the virus and binds its cognate receptor, angiotensin-converting enzyme 2 (ACE2), with higher affinity than does SARS-CoV. An S1/S2 polybasic furin protease cleavage site distinguishes SARS-CoV-2 from SARS-CoV and other closely related bat coronaviruses and serves to stimulate entry into host cells and cell-cell fusion. Inside the host cell, human coronaviruses remodel lipid metabolism to facilitate virus replication. Infection by SARS-CoV-2 triggers an unusually impaired and dysregulated immune response, together with a heightened inflammatory response that works in synergy with interferon production in the vicinity of infected cells to drive a feed-forward loop that up-regulates ACE2 and further escalates infection.

In the search for additional functions that contribute to the pathology of infection, we determined the structure of the SARS-CoV-2 S glycoprotein by cryo-EM. We produced SARS-CoV-2 S as a secreted trimer in MultiBac baculovirus-infected Hi5 insect cells (fig. S1). Highly purified protein was used for cryo-EM data collection (fig. S2 and table S1). After 3D classification and refinement without applying symmetry (C1) we obtained a 3.0 Å closed conformation from 136,405 particles and a 3.5 Å open conformation with one receptor-binding domain (RBD) in the up position from 57,990 particles (figs. S2 and S3). C3 symmetry was applied to the closed conformation particle pool yielding a 2.85 Å map (Fig. 1A and figs. S2 and S3).


The structure of S displays the characteristic overall shape observed for coronavirus S proteins in the closed and open conformations, with the closed form (~70%) predominating in our data set (Fig. 1A and figs. S2 to S4). Model building of the closed form revealed additional density in the RBDs in our structure. The tube-like shape of this density was consistent with a fatty acid, with size and shape similar to LA bound to other proteins. Liquid chromatography-coupled ESI-TOF mass spectrometry (LC-MS) analysis confirmed the presence of a compound with the molecular weight of LA in our purified sample.
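The ~70% closed fraction quoted above can be checked against the particle counts reported for the unsymmetrized (C1) refinement. The short sketch below does that arithmetic; it assumes the two reported particle pools (136,405 closed, 57,990 open) together make up the whole classified set, and the variable names are illustrative only.

```python
# Particle counts reported for the C1 refinement of the two conformations.
closed_particles = 136_405
open_particles = 57_990

# Assumption: these two pools are the complete set of classified particles.
total_particles = closed_particles + open_particles

closed_fraction = closed_particles / total_particles
open_fraction = open_particles / total_particles

print(f"total particles: {total_particles}")
print(f"closed fraction: {closed_fraction:.3f}")
print(f"open fraction:   {open_fraction:.3f}")
```

Rounded to two significant figures, the closed fraction comes out at 0.70 and the open fraction at 0.30, in line with the percentages given in the text.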

Hallmarks of FFA-binding pockets in proteins are an extended “greasy” tube lined by hydrophobic amino acids, which accommodates the hydrocarbon tail, and a hydrophilic, often positively charged anchor for the acidic headgroup of the FFA. In our structure, a hydrophobic pocket mostly shaped by phenylalanines forms a bent tube into which the LA fits well. The anchor for the headgroup carboxyl is provided by an arginine (R408) and a glutamine (Q409) from the adjacent RBD in the trimer, giving rise to a composite LA-binding site. We confirmed the presence of LA in all three binding pockets in the S trimer in the unsymmetrized (C1) closed structure (fig. S6). Similarly, masked 3D classification focusing on the RBDs could not identify any unoccupied pockets (fig. S7).

Our S construct contains alterations as compared to native SARS-CoV-2 S, namely addition of a trimerization domain and deletion of the polybasic cleavage site, neither of which alters S conformation appreciably. Glycosylation sites are located away from the LA-binding pocket and are largely native in our structure (table S2). Thus, neither mutations nor glycosylation are likely to impact the LA-binding pocket. To identify any potential influence of differences in glycosylation on ACE2 binding, we compared S and RBD produced in insect cells with mammalian-produced S by competition enzyme-linked immunosorbent assay (ELISA). All three reagents bound ACE2 efficiently. We further confirmed ACE2 binding by S using size-exclusion chromatography (SEC) with purified proteins. The LA-binding pocket and the receptor binding motif (RBM) are distal and non-overlapping. Notably, in the LA-bound S the RBM is ordered and buried at the interface between RBDs, whereas it was disordered in previously described SARS-CoV-2 S cryo-EM structures.

SARS-CoV-2 S can also adopt an open conformation (fig. S4), which is compatible with binding ACE2. In previous apo S cryo-EM structures about 60-75% of the S trimers were in the open conformation, in contrast with our observation of ~70% in the closed conformation. This could be due to LA stabilizing the closed conformation, and if so, LA would be expected to reduce ACE2 binding. We performed surface plasmon resonance (SPR) experiments with biotinylated ACE2 immobilized on a streptavidin-coated chip. We first determined the KD of the RBD/ACE2 interaction to validate our assay. Our value (26 nM, fig. S9C) is in good agreement with the value from previous studies (44 nM), obtained by SPR with the RBD immobilized and ACE2 as analyte. Apo S was prepared by applying Lipidex, the established method for removing lipids from lipid-binding proteins. A KD of 0.7 nM was obtained for the apo S/ACE2 interaction (fig. S9A). For the LA-bound S/ACE2 interaction we obtained a KD of 1.4 nM (fig. S9B). We consistently obtained a markedly reduced resonance unit (RU) signal for LA-bound S as compared to apo S at the same concentrations (Fig. 2D and fig. S9, A and B). This correlates with the apo state having a higher percentage of S trimers in the open, ACE2-accessible conformation.
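To make the relationship between KD, analyte concentration and RU signal concrete, the sketch below evaluates a simple 1:1 Langmuir steady-state model using the two KD values reported above. This is an illustration under stated assumptions, not the fitting procedure used for fig. S9: the 1:1 model ignores the multivalency of the S trimer, and the `active_fraction` parameter (the share of analyte able to bind ACE2), the `rmax_ru` value and the concentrations in the loop are hypothetical.

```python
KD_APO_NM = 0.7  # apo S / ACE2, from the text
KD_LA_NM = 1.4   # LA-bound S / ACE2, from the text


def fractional_occupancy(conc_nm: float, kd_nm: float) -> float:
    """Equilibrium fraction of immobilized ACE2 occupied (1:1 Langmuir model)."""
    if conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return conc_nm / (kd_nm + conc_nm)


def expected_response(conc_nm: float, kd_nm: float, rmax_ru: float,
                      active_fraction: float = 1.0) -> float:
    """Steady-state response in RU.

    `active_fraction` is a hypothetical parameter: the share of the injected
    analyte assumed to be in a binding-competent (open) state.
    """
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must lie in [0, 1]")
    effective_conc = active_fraction * conc_nm
    return rmax_ru * fractional_occupancy(effective_conc, kd_nm)


if __name__ == "__main__":
    rmax = 100.0  # hypothetical saturation response, RU
    for conc in (0.5, 2.0, 8.0):  # hypothetical analyte concentrations, nM
        apo = expected_response(conc, KD_APO_NM, rmax)
        la = expected_response(conc, KD_LA_NM, rmax)
        print(f"{conc:4.1f} nM  apo: {apo:6.2f} RU  LA-bound: {la:6.2f} RU")
```

In this model, a lower active fraction reduces the signal at a given nominal concentration in the same way as a lower analyte concentration, which is one way to read the correlation between RU signal and the share of trimers in the open conformation.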

We characterized the affinity of the LA interaction both experimentally and computationally. Our SPR assays utilizing immobilized RBD yielded a binding constant of ~41 nM exhibiting a slow off-rate, consistent with tight binding of LA (fig. S10). Repeated molecular dynamics simulations of the entire locked LA-bound spike trimer (3 × 100 ns) using GROMACS-2019 corroborated the persistence of stable interactions between LA and the spike trimer (movies S1 and S2). The affinity of LA binding to the spike trimer will likely be higher than to the RBD alone, taking into account polar headgroup interactions with R408 and Q409 of the adjacent RBD (Fig. 1E). The resolution of the RBDs in our open S cryo-EM structure was insufficient to either assign, or rule out, a ligand-bound pocket (fig. S3). However, the slow off-rate observed with the RBD monomer (fig. S10) suggests that LA binding could be maintained when the S trimer transiently converts into the open conformation. This is supported by our observation that LA was retained during S purification in spite of S trimers adopting the open form ~30% of the time (fig. S2) and by our MD simulations with a modeled ligand-bound open spike trimer (movie S3) in which all three LAs remained bound over 500 ns.

Next, we investigated the effect of LA in experiments using live SARS-CoV-2 virus to infect human epithelial cells. Remdesivir is an RNA-dependent RNA polymerase inhibitor and the first anti-viral drug showing benefit in the treatment of COVID-19 in clinical trials, albeit with considerable side effects at the doses required. LA supplementation at 50-100 μM concentrations was previously shown to affect coronavirus entry and replication. We administered remdesivir at concentrations of 20, 64 and 200 nM, supplementing with 50 μM LA. Our results revealed synergy, with the dose of remdesivir required to suppress SARS-CoV-2 replication markedly reduced by adding LA.
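The text does not state which reference model was used to call the combination synergistic. As a hedged illustration only, the sketch below uses Bliss independence, one common choice: two agents acting independently are expected to give a combined fractional inhibition of E_A + E_B - E_A·E_B, and an observed effect above that expectation is read as synergy. The function names are hypothetical and the numbers in the usage lines are placeholders, not measurements from this study.

```python
def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected combined fractional inhibition if the two agents act independently."""
    for effect in (effect_a, effect_b):
        if not 0.0 <= effect <= 1.0:
            raise ValueError("fractional effects must lie in [0, 1]")
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(observed_combined: float, effect_a: float, effect_b: float) -> float:
    """Observed minus expected effect; positive values suggest synergy."""
    return observed_combined - bliss_expected(effect_a, effect_b)


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT data from the paper):
    remdesivir_alone = 0.30  # fractional inhibition by remdesivir alone
    la_alone = 0.20          # fractional inhibition by LA alone
    combination = 0.80       # fractional inhibition by the combination

    excess = bliss_excess(combination, remdesivir_alone, la_alone)
    print(f"Bliss excess: {excess:+.2f}")
```

Other reference models, such as Loewe additivity, can give different verdicts on the same data, so the choice of model should be stated alongside any synergy claim.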

We superimposed our LA-bound structure on previous SARS-CoV-2 apo S structures in the closed conformation and identified a gating helix located directly at the entrance of the binding pocket. This gating helix, comprising Tyr365 and Tyr369, is displaced by about 6 Å when LA is bound, thus opening the pocket. In the apo SARS-CoV-2 S trimer, a gap between adjacent RBDs places the hydrophilic anchor residues ~10 Å away from the position of the LA headgroup. Upon LA binding, the adjacent RBD in the trimer moves toward its neighbor, and the anchor residues Arg408 and Gln409 lock down on the headgroup of LA. Overall, this results in a compaction of trimer architecture in the region formed by the three RBDs, giving rise to a locked S structure.

We investigated whether the LA-binding pocket is conserved in the seven coronaviruses that infect humans (Fig. 4A and table S3). Sequence alignment shows that all residues lining the hydrophobic pocket and the anchor residues (Arg408/Gln409) in SARS-CoV-2 are fully conserved in SARS-CoV (Fig. 4A). Structural alignment of LA-bound RBDs within the trimer of SARS-CoV-2 and “apo” SARS-CoV RBDs reveals that the LA-binding pocket is present in SARS-CoV. The greasy tube is flanked by a gating helix as in SARS-CoV-2, with Arg395/Gln396 of SARS-CoV positioned 10 Å and 11 Å from the entrance, respectively, virtually identical to apo SARS-CoV-2 (Figs. 3C and 4B). In MERS-CoV, the gating helix and hydrophobic residues lining the pocket are also present. Tyr365, Tyr369 and Phe374 are substituted by likewise hydrophobic leucines and a valine, respectively (Fig. 4, A and C). The Arg408/Gln409 pair is not conserved; however, we identify Asn501/Lys502 and Gln466 as potential anchor residues, located on a β-sheet and an α-helix within the adjacent RBD, up to 11 Å away from the entrance (Fig. 4C). Thus, the greasy tube and hydrophilic anchor appear to be present in MERS-CoV, suggesting convergent evolution. In HCoV OC43, the gating helix and hydrophobic residues lining the pocket are largely conserved, while Tyr365, Tyr369 and Phe374 are replaced by methionines and alanine, respectively (Fig. 4A). Arg413 is located on the same helix as Arg408/Gln409 in SARS-CoV-2 and could serve as a hydrophilic anchor (Fig. 4D). No gap exists in this presumed “apo” form structure between the RBDs, which appear already in the locked conformation (Fig. 4D and fig. S11). In HCoV HKU1, the hydrophobic residues are again largely conserved, but a charged residue (Glu375) is positioned directly in front of the entrance, obstructing access for a putative ligand (Fig. 4E). The RBDs of HCoVs 229E and NL63 adopt a very different fold (fig. S13), and many of the LA-binding residues are not present, hampering predictions of a binding site for fatty acids.

In summary, we find four molecular features mediating LA binding to SARS-CoV-2, and potentially also SARS-CoV and MERS-CoV S proteins: a conserved hydrophobic pocket, a gating helix, amino acid residues pre-positioned to interact with the LA carboxyl headgroup, and loosely packed RBDs in the “apo” form. In contrast, in each of the four common circulating HCoVs, it appears that one or more of these four architectural prerequisites are lacking in the S protein structures (Fig. 4 and figs. S11 and S12). LA binding to SARS-CoV-2 S triggers a locking down of the hydrophilic anchor and a compaction of the RBD trimer (Fig. 3, C and D). In addition to stabilizing the closed conformation, this could also help stabilize the S1 region comprising the N-terminal domain and the RBD. The RBM, central to ACE2 binding, appears to be conformationally preorganized in our structure (Fig. 2C), indicating a generally more rigid RBD trimer when LA is bound. While direct crosstalk between the LA-binding pocket and the RBM is not apparent from our structure (Fig. 2C), the conformational changes in the RBD trimer (Fig. 3) could impact ACE2 docking and infectivity, as indicated by our SPR assays showing reduced levels of S binding in the presence of LA (Fig. 2D). The S protein’s tight binding of LA originates from a well-defined size and shape complementarity afforded by the pocket (Fig. 1, B and D). The LA-binding pocket thus presents a promising target for future development of small molecule inhibitors that, for example, could irreversibly lock S in the closed conformation and interfere with receptor interactions. It is noteworthy in this context that a fatty acid binding pocket was exploited previously to develop potent small molecule anti-viral drugs against rhinovirus, locking viral surface proteins in a conformation incompatible with receptor binding. These anti-virals were successful in human clinical trials.

A recent proteomic and metabolomic study of COVID-19 patient sera showed a continuous decrease of FFAs, including LA. Lipid metabolome remodeling is a common element of viral infection. For coronaviruses, the LA to arachidonic acid metabolic pathway was identified as central to lipid remodeling. We hypothesize that LA sequestration by SARS-CoV-2 could confer a tissue-independent mechanism by which pathogenic coronavirus infection may drive immune dysregulation and inflammation. Our findings provide a direct structural link between LA, COVID-19 pathology and the virus itself and suggest that both the LA-binding pocket within the S protein and the multi-nodal LA signaling axis represent excellent therapeutic intervention points against SARS-CoV-2 infections.