The SARS-CoV-2 Spike Protein: Unpacking Claims of a "Human Mosaic" in Viral Origins, By Mrs (Dr) Abigail Knight

The issues discussed here are somewhat technical but important, going to the heart of the COVID origin story. In the labyrinth of COVID-19 origin theories, few claims ignite as much controversy as those suggesting that the virus, or its infamous spike protein, was engineered with human genetic elements. A newly archived preprint by researchers from the Saksena Lab, titled "A 32% Human-Derived Mosaic in the In Silico-Assembled SARS-CoV-2 Spike Protein: Accidental Contaminant Misincorporation or Intentional Functional Chimeric Design?", posits that 32% of the spike protein sequence (416 amino acids) bears striking similarity to human endogenous retroviruses (HERVs) and cellular proteins. Uploaded to Zenodo on November 11, 2025, the paper argues that this "mosaic" arose either from contamination during the virus's computational assembly or, more provocatively, from deliberate chimeric design. Drawing on NCBI BLASTp alignments, the authors call for re-analysis of the raw data to resolve the ambiguity, framing it as a pivotal question for a virus whose spike sequence underpins mRNA vaccines administered to billions.

The preprint hinges on the spike protein's origin story. As the authors note, the initial SARS-CoV-2 genome wasn't plucked from a purified virion but assembled in silico from fragmented RNA in bronchoalveolar lavage fluid (BALF) taken from a single Wuhan patient in December 2019. This metagenomic approach, detailed in Wu et al., used high-throughput sequencing to reconstruct the ~30 kb genome by de novo assembly and by aligning reads to reference coronaviruses. That genome includes the 1,273-amino-acid spike (YP_009724390.1), encoded by a gene of 3,822 nucleotides. The sequence was uploaded to GenBank and GISAID by January 10, 2020, enabling a rapid global response, including vaccine design.

The Saksena team's innovation? They ran the spike's protein sequence through NCBI's BLASTp tool against the human proteome, identifying "significant local similarity" in six functional categories: membrane fusion, receptor binding (RBD), immune modulation, trafficking, rigidity, and metabolic interference. In total, the hits cover 416 residues, just under 33% of the 1,273-residue full-length spike (a count that includes the signal peptide). The matches span more than 150 human loci and are reported as absent from bat (e.g., RaTG13) and pangolin coronaviruses. The authors compute a "statistical improbability" of random convergence at <10⁻²⁰, citing functional coherence, such as HERV upregulation in severe COVID-19 cases, and urge that human reference sequences be excluded from any re-assembly to test for contamination.
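How a headline percentage like this is tallied matters, because local hits against many loci can overlap on the query. The following is a minimal sketch, not the preprint's method: it merges overlapping hit intervals before counting covered residues. The hit coordinates are hypothetical and chosen only for illustration; only the 416 and 1,273 figures come from the article.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_residues(intervals):
    """Count query residues covered by at least one hit."""
    return sum(end - start + 1 for start, end in merge_intervals(intervals))


SPIKE_LENGTH = 1273  # full-length spike, as quoted in the article

# Hypothetical hit coordinates on the query (not taken from the preprint).
hypothetical_hits = [(10, 40), (30, 60), (100, 120), (121, 130)]

naive_total = sum(end - start + 1 for start, end in hypothetical_hits)
merged_total = covered_residues(hypothetical_hits)

print(f"naive sum of hit lengths: {naive_total}")
print(f"residues actually covered: {merged_total}")
print(f"claimed coverage: {416 / SPIKE_LENGTH:.1%}")
```

Summing raw hit lengths double-counts any residue matched by more than one locus, so a coverage figure is only meaningful if it is computed on merged intervals.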

At face value, this screams intrigue: Did lab assembly snag patient RNA, inflating the spike with host bits? Or was it a deliberate hybrid for enhanced human tropism? The paper's Zenodo archive includes raw BLAST files (e.g., H2C3P344014-Alignment.txt), bolstering reproducibility, a rare virtue in origin debates.

The Saksena analysis is methodologically transparent. BLASTp, a staple for homology detection, excels at spotting conserved motifs across evolution. By providing Request IDs, anyone can replicate the runs via NCBI's history tool, mitigating "black box" critiques plaguing earlier lab-leak preprints. The focus on functional domains aligns with spike biology: its trimeric structure facilitates ACE2 binding via RBD, furin cleavage for fusion, and immune evasion, hallmarks of natural betacoronavirus adaptation.

But there are objections. BLASTp is built for local alignment, and short matches are easily overinterpreted as "significant" when stripped of context. E-values below 10⁻³ sound impressive, but when a 1,273-residue query is searched against the full set of human protein entries (roughly 8% of the human genome is HERV-derived) and short hits are pooled across many loci, false positives abound. HERVs are ancient viral fossils, and they pepper the genome with fusion peptides resembling those of enveloped viruses such as coronaviruses: evolutionary echoes, not chimeras. Similar "human-like" motifs appear in unrelated pathogens (e.g., paramyxovirus fusion proteins), reflecting convergent evolution for host invasion, not design.
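To make the E-value point concrete, here is a rough sketch using the standard bit-score relation E = m · n · 2^(−S′), where m is the query length, n the database length in residues, and S′ the bit score. The database size below is an illustrative assumption, and real BLAST uses edge-corrected effective lengths, so treat the numbers as order-of-magnitude only.

```python
import math


def expected_chance_hits(bit_score, query_len, db_len):
    """Expected number of chance alignments scoring at least bit_score.

    Uses E = m * n * 2**(-S'), ignoring BLAST's effective-length corrections.
    """
    return query_len * db_len / 2 ** bit_score


def bit_score_for_evalue(evalue, query_len, db_len):
    """Bit score needed to reach a given E-value for this search space."""
    return math.log2(query_len * db_len / evalue)


QUERY_LEN = 1273           # spike length quoted in the article
ASSUMED_DB_LEN = 10 ** 7   # assumption: database of ~10 million residues

for score in (20, 30, 40, 50):
    e = expected_chance_hits(score, QUERY_LEN, ASSUMED_DB_LEN)
    print(f"bit score {score}: about {e:.3g} chance hits expected")

needed = bit_score_for_evalue(1e-3, QUERY_LEN, ASSUMED_DB_LEN)
print(f"bit score needed for E = 1e-3: {needed:.1f}")
```

Under these assumptions a modest bit score is expected to turn up by chance many times in a single search, which is why short, low-scoring matches carry little weight on their own.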

The 32% figure is inflated by fragmented hits: many are shorter than 10 residues, well below the usual rule of thumb for inferring homology (typically >30% identity over 100+ aa). Absent in bats? True, but SARS-CoV-2 diverges naturally from its relatives: it shares about 96% overall identity with RaTG13, with RBD changes that optimise human ACE2 binding through selection, not insertion. The <10⁻²⁰ probability? A post-hoc p-value mashup that ignores multiple-testing corrections; the apparent improbability collapses when benchmarked against randomised spikes, which yield similar "mosaics."
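The randomised-spike benchmark can be sketched as a simple permutation test. This is a toy illustration, not a re-analysis: it counts exact shared k-mers instead of running BLASTp, and the function names, the k-mer length, and the number of shuffles are all assumptions made for the example.

```python
import random


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_count(query, targets, k=5):
    """Count query positions whose k-mer occurs in any target sequence."""
    target_kmers = set()
    for target in targets:
        target_kmers |= kmers(target, k)
    return sum(
        1 for i in range(len(query) - k + 1) if query[i:i + k] in target_kmers
    )


def shuffled_null(query, targets, k=5, n_shuffles=200, seed=0):
    """Match counts for shuffled queries with the same residue composition."""
    rng = random.Random(seed)
    residues = list(query)
    counts = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        counts.append(shared_kmer_count("".join(residues), targets, k))
    return counts


def empirical_p(observed, null_counts):
    """One-sided empirical p-value with the usual +1 correction."""
    exceed = sum(1 for count in null_counts if count >= observed)
    return (exceed + 1) / (len(null_counts) + 1)
```

In use, one would compute `shared_kmer_count` for the real spike against a human protein set, then compare it with `shuffled_null` for the same query. If shuffled sequences score about as well, the "mosaic" is what composition alone predicts. The +1 correction also caps how small an empirical p-value can be: with 200 shuffles it cannot fall below 1/201, a long way from 10⁻²⁰.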

SARS-CoV-2's spike isn't a human hybrid; it's a genetically engineered Frankenstein virus, straight from the Wuhan Institute of Virology to damage the West!

https://jonfleetwood.substack.com/p/32-of-covid-spike-protein-matches 

 

Friday, 14 November 2025
