Pekar et al., (2025) Demolished!
Here you will find a scathing critique of the latest zoonotic claims by Pekar et al., (2025), which I recently published on X.
1. Last sick joke of the Zoonati?
The recency and geographical origins of the bat viruses ancestral to SARS-CoV and SARS-CoV-2
Their banal and tedious findings summarized:
Fragments of human SARS-CoVs share recent common ancestors with bat viruses
SARS-CoV-like viruses have circulated in Asia for millennia
Ancestors of human SARS-CoVs likely circulated in China & Laos
Ancestors traveled unexpectedly fast
2. No Pangolins Allowed in this Paper!
“There is insufficient temporal signal when calibrating a molecular clock using tip dating with sarbecoviruses sampled from bats and pangolins, likely as a consequence of limited sampling across space & time”
Therefore, we used SARS-CoV-1 genomes!
3. Definitely No Pangolins!
As sampling locations of SARS-CoV-1, SARS-CoV-2 and pangolin sarbecoviruses likely do not represent where their direct bat virus ancestors circulated, we EXCLUDED their locations from phylogeographic analyses to avoid the IMPACT of dispersal of non-bat hosts!
4. True Desperation
Due to substitution saturation messing up their standard molecular clock approaches and to account for its effects on SARS-CoV-1 and SARS-CoV-2-like phylogenies when trying to understand long-term dispersal patterns of sarbecoviruses, “we applied the Prisoner of War model”.
5. Why did they resort to using the POW model?
The answer is simple yet appalling: it was done to show that
“our inferences of the time of the ancestors of human SARS-CoVs and their closest bat sarbecoviruses are UNBIASED!”
6. Captain Obvious Strikes Again (1)
“we show that the ancestors of SARS-CoV-1 & SARS-CoV-2 likely circulated in horseshoe bat populations 100s to 1000s km away from the sites of the emergence of these viruses in humans & as recently as one to six years prior to this emergence”
7. Captain Obvious Strikes Again (2)
“Our findings indicate that there would not have been sufficient time for the direct bat virus ancestor to reach the locations of emergence of the human SARS-CoVs via normal dispersal through bat populations alone”
8. Circular Logic Based on Flawed & Debunked Claims
“Epicenter of SARS-CoV-2 pandemic was at a market in Wuhan that sold live wildlife from plausible intermediate host mammal species (FALSE)”
9. We couldn’t find them, so we will make some wild guesses?
“These geographically distant nodes therefore likely represent a lack of sarbecovirus sampling from areas between the distant locations rather than long-range bat movements”
10. The Zoonati Holy Grail is Full of Holes
“There are likely many more, still-undiscovered SARS-CoV-2-like viruses that would bridge the geographic gap between these regions, such that increased sampling in underexplored and under sampled regions would be HELPFUL”
11. They have even given up looking!
“We should not expect to find the direct ancestor of SARS-CoV-2 circulating in wild bats in future sampling. Future sequencing efforts should aim for whole-genome sequences to detect all genomic fragments that descend from closely related ancestors.”
12. So, now it is “Please Wait for our Mosaic Genomes”
“The non-recombinant segments of these mosaic genomes are pieces of a complex puzzle that are crucial for understanding the past and future emergence of SARS-CoVs into the human population”
13. Pekar’s Limitations are as Clear as Daylight!
“Because there is insufficient temporal signal when calibrating a molecular clock using tip dating with sarbecoviruses sampled from bats & pangolins, we needed to calibrate the clock using SARS-CoV-1 & -2 genomes from viruses sampled from humans & civets”
14. Declaration of “interests”
S.L. has received consulting fees from EcoHealth Alliance!
J.E.P., M.A.S., A.R., M.W., and J.O.W. have received consulting fees and/or provided compensated expert (and biased) testimony on SARS-CoV-2 and the COVID-19 pandemic.
15. The Zoonati Paper under discussion
Can be found here in all its toothless and sterile glory, rather like an old hag dressed up as a sailor’s tart, in a last desperate attempt to convince those drunk, or deluded enough, to pay for her unpleasant services.
The recency and geographical origins of the bat viruses ancestral to SARS-CoV and SARS-CoV-2
16. The Great Bat Chase Farce (1)
Pekar et al. (2025) claim bat viruses raced pell-mell across Asia to spark SARS-CoV outbreaks.
BUT
Their dispersal velocities are pure fantasy, built on cherry-picked NRRs and a deranged, flimsy molecular clock calibrated with shaky human data and speculative priors.
17. The Great Bat Chase Farce (2)
They use untested models like POW (Prisoner of War), applied to sparse bat data with no temporal grounding.
The statistical underpinnings (low ESS, arbitrary NRR rates, poor convergence) produce unreliable ancestor dates, which completely collapse under careful scrutiny.
18. Recombination Fiction?
Pekar et al., are quite obsessed with “non-recombinant regions” and they enjoy slicing up and butchering genomes like there is no tomorrow.
BUT
Their GARD analysis is really no more than a “black box”, spewing out breakpoints and fictional recombination patterns to support their increasingly absurd claims.
19. Phylogeographic Phantoms
The paper’s phylogeographic model and bat-only maps are indeed masterclasses in biased guesswork.
Excluding pangolin and human data to “avoid bias”?
Rather like trying to solve a murder by ignoring the victim’s corpse!
https://www.researchgate.net/publication/390743356_The_Pangolin_Coronavirus_Papers
20. Flawed Substitution Saturation Correction
On Page 5, they use a rather flaky POW model to deal with substitution saturation, which they claim skews their precious deep-time viral phylogenies.
BUT
The POW model’s rate decline assumptions lack any validation for sarbecoviruses, thus making its correction of data extremely unreliable.
See Page 11.
21. Unreliable Ancestor Dates
They claim “unbiased” recent ancestor dates (<50 years), but they rely on human SARS-CoV data for clock calibration, due to alleged weak bat data signals (p. 16).
The low ESS threshold (values only had to exceed 100) points to poor statistical convergence, which only serves to undermine accurate dating (see p. 23).
22 A. Speculative Long-Term Dispersal
They then brazenly infer sarbecovirus dispersal over millennia using their POW-transformed phylogenies (p. 11).
Sparse sampling (250 genomes) and their untested POW assumptions limit the reliability of their claims, rendering them merely speculative (p. 16).
22 B. Biased Sampling Distorts Geography!
They use a dataset skewed toward Yunnan and Laos (p. 16), leading to phylogeographic models that place SARS-CoV ancestors far from Wuhan & Guangdong (p. 12).
This sampling bias, of course, completely undermines the reliability of their geographic inferences.
23. Neglecting Alternative Hypotheses
No SARS-CoV-like viruses near emergence sites?
They completely overlook non-bat reservoirs, like civets or pangolins, which could explain local circulation (p. 15).
This omission only serves to further weaken their claim of distant ancestor origins (p. 12).
24. Inconsistent Molecular Clock Rates
The paper misuses variable NRR-specific clock rates, which give inconsistent SARS-CoV ancestor dates (e.g., 1944–2014 for SARS-CoV-2, p. 9).
Without any validation of bat-specific rates, this approach has no rational grounding at all in terms of ecology (p. 14).
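To see why the choice of NRR-specific rate matters so much for the dates, here is a minimal sketch under a strict clock, where the time back to an ancestor is simply branch length divided by rate. The function name, the branch length, the three rates, and the sampling year are all hypothetical illustrative values, not numbers taken from Pekar et al.

```python
def ancestor_year(sampling_year: float, branch_length: float, clock_rate: float) -> float:
    """Strict-clock date of an ancestor.

    branch_length: substitutions per site between the ancestor and the sampled tip
    clock_rate:    substitutions per site per year
    """
    if clock_rate <= 0:
        raise ValueError("clock_rate must be positive")
    return sampling_year - branch_length / clock_rate


# Hypothetical numbers: one branch length, three assumed region-specific rates.
branch_length = 0.01
for rate in (2.5e-4, 5e-4, 1e-3):
    year = ancestor_year(2019, branch_length, rate)
    print(f"rate={rate:.1e} subs/site/year -> ancestor year ~ {year:.0f}")
```

With the same branch length, a fourfold spread in the assumed rate moves the inferred ancestor date by decades, which is why unvalidated per-region rates deserve scrutiny.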
25. Speculative Hypothetical Taxa
They introduce unsampled taxa in Guizhou and Guangxi to test ancestor locations (p. 13).
These taxa lack sequence data and rely on assumed phylogenetic ties!
This merely adds further uncertainty and undermines their increasingly perverse phylogeographic inferences.
26. Implausible Dispersal Velocities
Even using their “hypothetical taxa”, they find very high dispersal velocity ranks for SARS-CoV-1 and SARS-CoV-2 (p. 13).
This implies that bat-to-human virus movement is unrealistic, which again serves to weaken their zoonotic origin hypothesis (p. 13).
27. Unsubstantiated Market Origin Claim
The authors, as is their custom, claim that SARS-CoV-2 originated at the Huanan seafood market (p. 15), but as we all know, they have yet to show any virological evidence of infected animals.
The cited studies rely on epidemiology, but have no direct proof, thus weakening their grandiose claims in this flawed paper.
28. Speculative Intermediate Hosts
The paper cites “plausible” intermediate hosts like raccoon dogs without genomic or serological data from market animals (p. 15).
Please see the following tweets for a thorough discussion of raccoon dogs as implausible intermediate hosts:
Eight Inconvenient Facts about Raccoon Dogs as intermediate hosts
Market Data Report https://x.com/BillyBostickson/status/1638085551197945857
Raccoon Dogs as intermediate hosts? https://x.com/BillyBostickson/status/1637116596245381121
Raccoon Dog analysis by @Daoyu15: https://archive.md/yyX0Z
Market Data by @jbloom_lab: https://x.com/jbloom_lab/status/1651428639676960769 and their paper: https://biorxiv.org/content/10.1101/2023.04.25.538336v1
New coronaviruses in raccoon dog sample Q61: https://x.com/stevenemassey/status/1649837163033530368
On the claimed immune activation of raccoon dogs at Huanan Seafood Market: https://x.com/stevenemassey/status/1864348559178391594
We have now discovered that Volker Thiel’s lab failed to experimentally infect raccoon dogs, but did not make their failure public:
8. Beer, when asked why he and Freuling used a D614G-mutated strain to barely infect raccoon dogs instead of the original 614D strain, and why they failed to mention this in the paper, brazenly claimed that he “didn’t know whether 614D or 614G came first”!
By ignoring alternative origins, their zoonotic spillover claims thus lack any rigour.
29. Biased NRR Selection for recCA
They use GARD to define non-recombinant regions (NRRs), splitting genomes at recombination breakpoints (p. 7).
Arbitrary breakpoint thresholds bias NRRs towards high-identity regions, inflating the recombinant common ancestor’s (recCA) similarity to SARS-CoVs.
Helpful Definitions
A. recCA (Recombinant Common Ancestor):
is a reconstructed viral genome that represents the hypothetical ancestor of a human SARS-CoV (SARS-CoV-1 or SARS-CoV-2) based on bat sarbecovirus sequences. Here, Pekar et al., create it by combining the most likely nucleotide sequences from multiple non-recombinant regions (NRRs) at the phylogenetic node closest to the human virus (p. 14).
For each NRR, the authors use Bayesian phylogenetic analysis (via BEAST) to infer the ancestral sequence at the parent node of SARS-CoV-1 or SARS-CoV-2 in the phylogeny. These sequences are then concatenated to form a single genome, the recCA, which is claimed to have >98% genetic identity to the human SARS-CoVs (p. 14).
The recCA aims to estimate how genetically similar the closest bat virus ancestor was to the human SARS-CoVs, supporting the idea of a recent zoonotic spillover from bats (p. 14).
However, the recCA’s high similarity is an artifact of selectively choosing NRRs that maximize identity while ignoring divergent regions (e.g., the Spike gene), making the >98% claim misleading (p. 14).
B. NRRs (Non-Recombinant Regions):
are segments of the sarbecovirus genome that are inferred to be free of recombination events, meaning they have a single evolutionary history. Pekar et al., identify 31 NRRs for SARS-CoV-1-like viruses and 44 for SARS-CoV-2-like viruses using the GARD algorithm (p. 7).
The GARD algorithm analyzes genome alignments to detect recombination breakpoints, splitting the genome into regions (NRRs) where no recombination is detected. Each NRR is treated as an independent phylogenetic unit for analysis (p. 20).
NRRs allow the authors to analyze the evolutionary history of sarbecoviruses without the confounding effects of recombination, which can mix genetic material from different viral lineages. They use NRRs for clock calibration, divergence dating, and recCA construction (p. 7, 14).
However, the selection of NRRs is biased due to arbitrary GARD breakpoint thresholds (e.g., including breakpoints in 1/3 or 1/2 of models, merging those <100 nucleotides apart), which can favor high-identity regions, skewing results like the recCA’s similarity (p. 20).
C. GARD (Genetic Algorithm for Recombination Detection):
is a computational tool used to detect recombination breakpoints in genomic sequences by identifying regions where phylogenetic tree topologies differ, indicating past recombination events. Here, the authors apply GARD to sarbecovirus genomes to define NRRs (p. 7).
GARD analyzes aligned sequences (e.g., 250 sarbecovirus genomes) and tests for incongruent tree topologies across the genome. It outputs a set of breakpoints where recombination likely occurred, splitting the genome into NRRs. The authors used GARD separately for SARS-CoV-1-like (38 genomes, 56 breakpoints) and SARS-CoV-2-like (24 genomes, 92 breakpoints) datasets (p. 20).
GARD enables the authors to account for recombination, a common feature in sarbecoviruses, by isolating NRRs for phylogenetic and phylogeographic analyses, ensuring each region reflects a single evolutionary history (p. 7).
Unfortunately, GARD’s breakpoint selection is arbitrary (e.g., accepting breakpoints in 1/3 or 1/2 of models, merging close ones), which can bias NRR definitions toward regions that align closely with SARS-CoVs, inflating the recCA’s similarity and skewing results (p. 20).
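To make the threshold issue in the definitions above concrete, here is a minimal sketch of the post-processing step only: keeping breakpoints that appear in at least a given fraction of candidate models, then cutting the alignment into regions. This is not GARD itself. The function names, support values, and alignment length are hypothetical and chosen purely for illustration.

```python
def select_breakpoints(support: dict[int, float], min_fraction: float) -> list[int]:
    """Keep breakpoints present in at least min_fraction of the candidate models."""
    return sorted(pos for pos, frac in support.items() if frac >= min_fraction)


def split_into_regions(alignment_length: int, breakpoints: list[int]) -> list[tuple[int, int]]:
    """Split [0, alignment_length) into half-open regions at the given breakpoints."""
    inner = [b for b in sorted(set(breakpoints)) if 0 < b < alignment_length]
    cuts = [0] + inner + [alignment_length]
    return list(zip(cuts[:-1], cuts[1:]))


# Hypothetical support: fraction of models that contain each breakpoint position.
support = {1200: 0.9, 4300: 0.4, 4350: 0.35, 9800: 0.6, 15000: 0.2}
for threshold in (1 / 3, 1 / 2):
    regions = split_into_regions(20000, select_breakpoints(support, threshold))
    print(f"threshold={threshold:.2f}: {len(regions)} regions -> {regions}")
```

On the same toy support values, moving the threshold from 1/3 to 1/2 changes both the number of regions and where they begin and end, so every downstream per-region analysis inherits that choice.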
30. Misleading recCA Similarity Claim
The recCA, a reconstructed ancestral genome, shows >98% identity to SARS-CoVs by concatenating selected NRRs (p. 14).
By willfully excluding divergent NRRs, like the Spike region, this only serves to overstate bat ancestor proximity (p. 14).
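The arithmetic behind this concern can be sketched in a few lines: identity computed over a concatenation depends on which segments go into it. The sequences below are short made-up strings, not real sarbecovirus data, and the region names and helper functions are hypothetical. The sketch only shows how the number moves and does not establish which regions the authors did or did not include.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def concatenated_identity(regions: dict[str, tuple[str, str]], include: list[str]) -> float:
    """Identity over the concatenation of the chosen regions (ancestor vs. human virus)."""
    ancestor = "".join(regions[name][0] for name in include)
    human = "".join(regions[name][1] for name in include)
    return identity(ancestor, human)


# Toy aligned segments (hypothetical): (reconstructed ancestor, human virus).
regions = {
    "NRR_a": ("ACGTACGTAC", "ACGTACGTAC"),  # identical
    "NRR_b": ("GGCTAGGCTA", "GGCTAGGCTT"),  # one mismatch
    "NRR_c": ("TTGACCATGA", "TAGTCCTTGC"),  # divergent segment, four mismatches
}
print(concatenated_identity(regions, ["NRR_a", "NRR_b"]))
print(concatenated_identity(regions, ["NRR_a", "NRR_b", "NRR_c"]))
```

Leaving the divergent toy segment out of the concatenation raises the headline identity, which is exactly the kind of sensitivity a reader would want reported alongside any single percentage.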
31. Unjustified CTMC Priors in BEAST
The authors use BEAST with CTMC priors for NRR-specific clock rates (p. 22).
These uninformative priors allow wide rate ranges, risking biased estimates without validation for bat sarbecoviruses, which undermines phylogenetic reliability (p. 16).
32. Risk of Statistical Overfitting
Firstly, their BEAST models’ flexibility in estimating NRR rates may overfit to human-derived data, which, of course, perfectly aligns with their biased zoonotic narrative (p. 23).
Secondly, the lack of any bat-specific validation weakens the strength of their grandiose phylogenetic inferences (p. 14).
Brief Explainer Section
A. Bayesian BEAST Models
Pekar et al., use BEAST (Bayesian Evolutionary Analysis Sampling Trees) to perform phylogenetic and phylogeographic inferences, including molecular clock calibration, divergence time estimation, and ancestral state reconstruction for 31 (SARS-CoV-1) and 44 (SARS-CoV-2) non-recombinant regions (NRRs) (p. 22–23). These models rely on complex statistical frameworks to estimate evolutionary parameters.
B. CTMC Priors Issue
The authors use Continuous-Time Markov Chain (CTMC) rate reference priors for NRR-specific clock rates during molecular clock calibration (p. 22). CTMC priors are uninformative, meaning they allow a wide range of possible rates, which can lead to overfitting or biased estimates if not constrained by strong data. The paper notes that these priors are applied to each NRR independently, with rates estimated from human SARS-CoV-1 and SARS-CoV-2 genomes (p. 22).
C. Cherry-Picking Concerns:
The selection of NRR-specific clock rates is biased due to variable rates.
Clock rates vary significantly across NRRs (e.g., Table S1), with no clear justification for why each NRR should have a unique rate (p. 9).
This variability can produce inconsistent divergence estimates, as seen in the wide range of ancestor dates (e.g., 1944–2014 for SARS-CoV-2, p. 9).
D. Lack of Validation
The authors do not validate whether CTMC priors are appropriate for bat sarbecoviruses, given the weak temporal signal in bat data (p. 16).
This risks overfitting to human-derived rates, skewing results to support a recent bat-to-human spillover (p. 14).
E. Statistical Weakness
The BEAST models require multiple Markov Chain Monte Carlo (MCMC) chains to achieve effective sample sizes (ESS) above 100 or 200, indicating poor convergence and statistical instability (p. 23).
This undermines the reliability of the phylogenetic inferences.
F. Implications
By using flexible CTMC priors and allowing variable NRR rates without robust validation, the models may artificially align with the zoonotic narrative, making the results less reliable, but of course more convenient for the authors’ clearly biased zoonotic agenda.
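For readers unfamiliar with ESS, here is a minimal sketch of how an effective sample size can be estimated from a chain of samples. It uses the textbook form N / (1 + 2 × sum of autocorrelations), truncated at the first non-positive lag. That truncation rule is an assumption of this sketch and is not necessarily the estimator used by BEAST or Tracer. The AR(1) chains are a toy stand-in for MCMC output, and all names are hypothetical.

```python
import random


def effective_sample_size(samples: list[float]) -> float:
    """Estimate ESS as N / (1 + 2 * sum of positive-lag autocorrelations).

    The sum stops at the first non-positive autocorrelation (a simple heuristic).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    var = sum(c * c for c in centered) / n
    if var == 0:
        raise ValueError("chain is constant; ESS is undefined")
    rho_sum = 0.0
    for lag in range(1, n):
        acov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = acov / var
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)


def ar1_chain(n: int, phi: float, seed: int) -> list[float]:
    """Toy autocorrelated chain: x[t] = phi * x[t-1] + Gaussian noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        out.append(x)
    return out


weak = effective_sample_size(ar1_chain(2000, 0.1, seed=1))
strong = effective_sample_size(ar1_chain(2000, 0.95, seed=1))
print(f"ESS, weakly autocorrelated chain: {weak:.0f}")
print(f"ESS, strongly autocorrelated chain: {strong:.0f}")
```

Both toy chains contain 2,000 samples, yet the strongly autocorrelated one carries far less independent information, which is the quantity an ESS threshold is meant to police.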
33. Exclusion of Non-Bat Data
The authors deliberately exclude pangolin & human Sarbecovirus data from phylogeographic analyses to focus on bats (p. 8).
Not testing other host contributions limits our understanding of alternative zoonotic and non-zoonotic pathways, leading to bias (p. 15).
34. Variable NRR Counts
They split genomes into 31 NRRs for SARS-CoV-1 and 44 for SARS-CoV-2 using GARD.
This inconsistent segmentation, driven by different recombination signals, complicates comparisons and points to biased phylogenetic outcomes which, of course, support their biased agenda (p. 7).
35. Ecologically Invalid Calibration
With weak bat data signals, the authors calibrate molecular clocks using human SARS-CoV genomes (p. 16).
Applying human rates to bats without ecological validation risks inaccurate divergence estimates for sarbecoviruses, or perhaps is designed to muddy the waters with them (p. 14).
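"Temporal signal" is often checked informally by regressing root-to-tip genetic distance on sampling date: a clear positive slope suggests the data can calibrate a clock, while a flat or noisy cloud suggests they cannot. The sketch below is a plain least-squares version of that check with made-up inputs. It is offered as background and is not a claim about the exact procedure the authors used. The function name and data are hypothetical.

```python
def root_to_tip_regression(dates: list[float], distances: list[float]) -> tuple[float, float, float]:
    """Ordinary least squares of root-to-tip distance on sampling date.

    Returns (slope, intercept, r_squared). The slope is a crude clock-rate
    estimate in substitutions/site/year; r_squared is a rough gauge of signal.
    """
    n = len(dates)
    if n != len(distances) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and distances must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy * sxy) / (sxx * syy)


# Hypothetical inputs: sampling years and root-to-tip distances (subs/site).
dates = [2000.0, 2005.0, 2010.0, 2015.0]
clocklike = [0.010, 0.015, 0.020, 0.025]
noisy = [0.020, 0.012, 0.024, 0.015]
print(root_to_tip_regression(dates, clocklike))
print(root_to_tip_regression(dates, noisy))
```

The point of the comparison is that a rate borrowed from a well-behaved dataset says nothing about whether the second, noisier dataset evolves at that rate.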
36. Arbitrary Taxa Constraints
The authors add hypothetical taxa with assumed phylogenetic ties to SARS-CoVs to test phylogeography.
These arbitrary constraints, lacking sequence data, introduce uncertainty and further weaken the reliability of their geographic inferences (p. 13).
37. Unreliable Diffusion Coefficients
The authors estimate diffusion coefficients (e.g., 1,666 km²/year) to model bat virus dispersal (p. 13).
These metrics, based on sparse sampling and untested assumptions, are merely hypothetical and overstate dispersal reliability (p. 16).
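To show what such dispersal statistics summarize, here is a minimal sketch using one common definition from the phylogeography literature: a weighted diffusion coefficient, sum(d²) / (4 × sum(t)), and a weighted dispersal velocity, sum(d) / sum(t), taken over tree branches with great-circle length d and duration t. Whether the paper uses exactly these definitions is an assumption here. The coordinates, durations, and function names are hypothetical.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km on a sphere of radius 6371 km."""
    radius = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def weighted_diffusion_coefficient(branches: list[tuple[float, float]]) -> float:
    """sum(d_i^2) / (4 * sum(t_i)) in km^2/year; branches are (distance_km, years)."""
    total_time = sum(t for _, t in branches)
    if total_time <= 0:
        raise ValueError("total branch duration must be positive")
    return sum(d * d for d, _ in branches) / (4 * total_time)


def weighted_dispersal_velocity(branches: list[tuple[float, float]]) -> float:
    """sum(d_i) / sum(t_i) in km/year; branches are (distance_km, years)."""
    total_time = sum(t for _, t in branches)
    if total_time <= 0:
        raise ValueError("total branch duration must be positive")
    return sum(d for d, _ in branches) / total_time


# Hypothetical branches: (start lat, start lon, end lat, end lon, duration in years).
toy = [(25.0, 102.7, 23.1, 113.3, 5.0), (20.0, 102.0, 30.6, 114.3, 8.0)]
branches = [(haversine_km(a, b, c, d), t) for a, b, c, d, t in toy]
print(f"diffusion coefficient: {weighted_diffusion_coefficient(branches):.0f} km^2/year")
print(f"dispersal velocity: {weighted_dispersal_velocity(branches):.0f} km/year")
```

Both statistics are driven entirely by inferred branch end points and inferred branch durations, so any error in node locations or dates flows straight into them.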
38. Prolonged MCMC Chains
The authors’ BEAST analyses require MCMC chains of millions of generations to achieve minimal convergence (p. 23).
Markov Chain Monte Carlo (MCMC) samples can indeed model parameters, but such lengths suggest computational instability, which in turn, undermines their inferences (p. 23).
39. Selective Genome Trimming
The authors trim genome ends due to sequencing challenges before NRR analysis (p. 20).
This selective removal of data, without assessing its impact, risks excluding variable regions, and biases phylogenetic results to fit their agenda (p. 20).
40. Weak Skygrid Prior
The authors use a “skygrid” prior (with 49 grid points) to model population dynamics in BEAST (p. 22).
This flexible prior, estimating population size changes, relies on sparse data, again reducing the accuracy of their evolutionary inferences (p. 16).
41. Arbitrary Breakpoint Merging
The authors merge GARD recombination breakpoints closer than 100 nucleotides (p. 20).
This arbitrary rule alters NRR boundaries, potentially skewing not only their phylogenetic analyses, but also their subsequent inferences (p. 20).
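Here is a minimal sketch of what a "merge breakpoints closer than 100 nucleotides" rule can look like. The paper's exact rule is not reproduced here: this version assumes single-linkage runs (each breakpoint is compared with its immediate predecessor) and replaces each run with the midpoint of its first and last member. Both choices, the function name, and the positions are hypothetical.

```python
def merge_close_breakpoints(breakpoints: list[int], min_gap: int = 100) -> list[int]:
    """Merge runs of breakpoints whose neighbours are closer than min_gap.

    Assumption of this sketch: each run is replaced by the integer midpoint
    of its first and last member.
    """
    merged: list[int] = []
    run: list[int] = []
    for pos in sorted(set(breakpoints)):
        if run and pos - run[-1] >= min_gap:
            merged.append((run[0] + run[-1]) // 2)
            run = []
        run.append(pos)
    if run:
        merged.append((run[0] + run[-1]) // 2)
    return merged


# Hypothetical breakpoint positions along an alignment.
print(merge_close_breakpoints([1200, 4300, 4350, 4420, 9800]))
print(merge_close_breakpoints([1200, 4300, 4350, 4420, 9800], min_gap=50))
```

Changing either the gap or the representative position shifts region boundaries, so the rule is a modelling decision rather than a neutral clean-up step.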
42. Limited Sensitivity Analyses
The authors’ sensitivity analyses, tweaking priors and adding civet genomes, fail to address their core data limitations.
These superficial tests therefore cannot be used to validate their phylogenetic or phylogeographic conclusions (p. 22).
43. Flawed Bat Distribution Models
The authors use “Maxent” to model bat distributions, clipped to Southeast Asia (p. 14).
Yes, Maxent can be used to predict the presence of species, but sparse data & regional limits reduce the models’ accuracy for Sarbecovirus ecology (p. 16).
44. HMMCleaner Data Removal
The authors use HMMCleaner to remove poorly aligned genome segments before NRR analysis (p. 20).
Although HMMCleaner corrects certain alignment errors, this also risks discarding informative variation, which in turn biases the phylogenetic results (p. 20). And that is just another way of hoodwinking the media!
45. Weak Isolation-by-Distance Signal (1)
Isolation by distance (IBD; Slatkin, 1987, 1993) describes a pattern of population differentiation based on the “stepping-stone” model of population structure (Kimura and Weiss, 1964) in which genetic differences between populations increase with geographic scale.
46. Weak Isolation-by-Distance Signal (2)
The authors report a moderate isolation-by-distance (IBD) signal (Spearman correlations 0.45–0.67, p. 13).
IBD links genetic and geographic distance, but sparse sampling reduces the signal’s reliability for dispersal inferences (p. 16).
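For reference, a Spearman correlation between geographic and genetic distance is just the Pearson correlation of their ranks. The sketch below computes the coefficient only, with ties given average ranks. The paired distances are invented for illustration and the function names are hypothetical.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("each variable must take more than one value")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pairwise distances: geographic (km) vs. genetic (subs/site).
geographic_km = [50.0, 300.0, 800.0, 1200.0, 2000.0]
genetic = [0.02, 0.01, 0.05, 0.04, 0.09]
print(f"Spearman rho = {spearman(geographic_km, genetic):.2f}")
```

With only a handful of sampled locations, moving one or two points can change the coefficient substantially, which is the sparse-sampling worry in a nutshell.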
47. Selective Topology Adjustment
The authors collapse phylogenetic nodes with bootstrap support below 70 (p. 21).
This adjustment, simplifying tree topologies, masks any uncertainty and, of course, biases the subsequent inferences to favour their increasingly challenged zoonotic hypothesis (p. 21).
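What "collapsing" a node means in practice can be sketched as follows: an internal node whose support falls below the cutoff is removed and its children are attached to its parent, producing a polytomy. The tree, the `Node` class, and the helper names are hypothetical, and branch lengths are ignored to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str = ""
    support: Optional[float] = None  # bootstrap support, internal nodes only
    children: List["Node"] = field(default_factory=list)


def collapse_low_support(node: Node, threshold: float = 70.0) -> Node:
    """Return a copy in which internal nodes with support < threshold are removed
    and their children re-attached to the parent. The root is always kept."""
    new_children: List[Node] = []
    for child in node.children:
        collapsed = collapse_low_support(child, threshold)
        low = collapsed.support is not None and collapsed.support < threshold
        if collapsed.children and low:
            new_children.extend(collapsed.children)  # promote grandchildren
        else:
            new_children.append(collapsed)
    return Node(node.name, node.support, new_children)


def to_newick(node: Node) -> str:
    """Topology-only Newick string (no branch lengths, no trailing semicolon)."""
    if not node.children:
        return node.name
    return "(" + ",".join(to_newick(c) for c in node.children) + ")"


# Hypothetical tree: one weakly supported clade (55) and one well supported clade (95).
tree = Node(children=[
    Node(support=55, children=[Node("A"), Node("B")]),
    Node(support=95, children=[Node("C"), Node("D")]),
])
print(to_newick(tree))
print(to_newick(collapse_low_support(tree)))
```

After collapsing, the tree no longer records that A and B were ever grouped, however weakly, so downstream analyses see a cleaner topology than the data strictly support.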
48. Adjusted Rate Priors
The authors reduce rate prior standard deviations for SARS-CoV-1 to aid their BEAST (ly) convergence (p. 23).
This adjustment lacks justification, and again leads to the skewing of clock rate estimates and phylogenetic outcomes (p. 23).
49. Overstated Deep-Time Estimates
The authors’ tMRCA estimates of thousands of years rely on flexible clock models (p. 11).
Without fossil or ecological validation, these deep-time inferences overstate Sarbecovirus evolutionary history (p. 11).
50. Statistical and Data Flaws Summarized
The authors rely on flawed GARD, BEAST, and PoW models, with arbitrary breakpoints, low ESS, and unvalidated priors (p. 7, 23).
Excluding pangolin/human data, trimming genomes, and cherry-picking NRRs only serve to bias their exaggerated zoonotic claims (p. 8, 20).
51. Methodological and Bias Concerns Summarized
Flimsy sensitivity tests, biased Maxent models & topology “tweaks” undermine the authors’ methods (p. 14, 22).
Inventing taxa and ignoring non-zoonotic origins, plus EcoHealth ties, all combine to weaken the strength of their already rather weak claims (p. 13, 15).
THE END…. OF PEKAR ET AL., (2025)