Entomological collections were conducted in Garithe village, Kilifi County, along the Kenyan coast (Additional file 1: Fig. S1). This region is highly diverse, made up of dense forests, dry thorny bushes, savannah vegetation, and seasonal swamps of brackish water. There are two distinct rainy seasons: long rains which occur between April and July, and short rains between October and November . The site was targeted for collections based on previous studies that described An. merus in high density .
Mosquitoes were collected in 20 houses over six consecutive days during the month of November in 2019 (Additional file 1: Fig. S1). Collections were carried out using Centers for Disease Control and Prevention light traps (CDC-LT) both indoors and outdoors for each house from dusk (1800 h) to dawn (0600 h), whilst coordinates were collected using eTrex® 10 (Garmin, Kansas, United States of America). The indoor traps were set in houses where at least one person spent the night during the collection period. The outdoor traps were strategically set next to the livestock shed, and where livestock were absent, the trap was set approximately 5 m from the household selected for indoor sampling. The collected mosquitoes were identified morphologically in the field laboratory  and sorted by physiological stage and sex. All the Anopheles spp. were preserved individually in 1.5 ml microcentrifuge tubes containing silica pellets and transported to the KEMRI Wellcome Trust Research Programme (KWTRP) laboratory and stored at −80 °C.
Using sterile scalpel and forceps, the female Anopheles mosquitoes were dissected and separated into distinct body parts for different assays. The legs and wings were used for An. gambiae s.l. sibling species identification, the head and thorax for the Plasmodium falciparum infection status, and the abdomen of blood-fed mosquitoes for trophic pattern and preference analysis as described previously .
Anopheles gambiae s.l. sibling species identification
Genomic deoxyribonucleic acid (DNA) was extracted from the legs and wings of mosquitoes as described previously, with minor modifications . Briefly, the mosquito parts were transferred into 1.5 ml microcentrifuge tubes containing 50 µl of 20% Chelex and crushed using polypropylene pestles. The lysate was incubated at 100 °C while shaking at 650 revolutions per minute (rpm) using a ThermoMixer (Eppendorf, Hamburg, Germany). The solution was then centrifuged at 10,000×g for 2 min, and the supernatant was transferred to a new 1.5 ml microcentrifuge tube. This was repeated twice, and the DNA was stored at −80 °C.
Anopheles gambiae s.l. sibling species were identified using a previously described method using primers that target the intergenic spacer (IGS) region of the ribosomal DNA . The species were distinguished by their band sizes after running agarose gel electrophoresis as follows: 153 base pairs (bp) for Anopheles quadriannulatus, 315 bp for An. arabiensis, 390 bp for An. gambiae sensu stricto, 464 bp for Anopheles melas, and 466 bp for An. merus . Anopheles funestus complex sibling species were also identified using polymerase chain reaction (PCR) with primers targeting the internal transcribed spacer region 2 (ITS2) .
Blood meal analysis
The abdomens of the blood-fed female Anopheles mosquitoes were crushed in 50 µl of molecular-grade water using sterile polypropylene pestles. Thirty microlitres of the lysate was mixed with 500 µl of phosphate-buffered saline (PBS) and then used to determine the source of blood meal using direct enzyme-linked immunosorbent assays (ELISA) as described previously [28, 29], with slight modifications. The samples were tested against the anti-host immunoglobulin G (IgG): human, goat, bovine, and chicken. The results were read visually as described previously .
Plasmodium falciparum sporozoite analysis
The mosquito head and thorax were crushed in 100 µl of 1× PBS in 1.5 ml microcentrifuge tubes. Then, 10 µl of 10× saponin was added to the lysate. The solution was subsequently incubated at room temperature for 20 min, and then centrifuged at 20,000×g for 2 min, and the supernatant was discarded. The pellet was resuspended in 100 µl 1× PBS. The solution was then centrifuged at 2000×g for 2 min and the supernatant was discarded. The pellet was resuspended in 50 µl of 20% Chelex and DNA extracted using the procedure described above. Thereafter, the extracted DNA was used for SYBR Green real-time PCR (RT-PCR) assays using primers described in Hermsen et al. . Briefly, the RT-PCR reaction consisted of 7.5 µl of QuantiTect SYBR Green PCR master mix (Qiagen, Hilden, Germany), 7.5 µM of forward primer 5′-GTAATTGGAATGATAGGATTTACAAGGT-3′ and 7.5 µM of the reverse primer 5′-TCAACTACGAACGTTTTAACTGCAAC-3′, 2 µl of nuclease-free water, and 4 µl of the DNA. RT-PCR conditions were as follows: 95 °C for 10 min for HotStarTaq DNA Polymerase activation, 40 cycles of 95 °C for 30 s, 60 °C for 45 s, 68 °C for 45 s, and finally the melt curve phase: 95 °C for 15 s, 60 °C for 1 min, and 95 °C for 30 s.
The highly polymorphic TED region of TEP1 was amplified using the primers VB229 5′-TCAACTTGGACATCAACAAGAAG-3′ and VB004 5′-ACATCAATTTGCTCCGAGTT-3′ as described previously [19, 32]. Thereafter, the 1088 ± 1-bp amplicon was cleaned using the QIAquick PCR purification kit (Qiagen, Hilden, Germany) and eluted in 15 µl of DNase-free water. The samples were subjected to both Sanger and next-generation sequencing (NGS).
For the Sanger sequencing, the PCR amplicons were sequenced using the primers (VB004 and VB229), BigDye Terminator chemistry v 3.1 (Applied Biosystems, UK), and the reaction was run on an ABI 3730xl capillary sequencer (Applied Biosystems, UK). The resulting sequence chromatograms were curated, edited, and aligned using the CLC Main Workbench 7 (CLC Bio, Qiagen, Aarhus, Denmark).
For NGS, 75 cycles of paired-end sequencing were carried out on the MiSeq platform using the Nextera DNA Flex library preparation protocol and MiSeq reagent kit v 3 (Illumina, USA). The resulting reads were demultiplexed, and then FastQC v 0.11.9 was used to remove indexes, low-quality PhiX and adapter reads. The identity of the resulting reads was ascertained using both BLASTn v 2.11.0 and mapping onto the TEP1 references.
Multiple sequence alignment and sequence editing was conducted in AliView v 1.25 software, using the newly sequenced TEP1-TED sequences and those collated from GenBank . Maximum likelihood phylogenies were reconstructed using the TIM+F+I substitution model with four gamma categories (TIM+F+I+G4) as the best fitting model as inferred by jModelTest in iqtree v 1.6.9  and visualised using FigTree v 1.4.4.
Statistical analysis, visualisation, and mapping were performed using R software (v 4.1.0) . The data were further analysed using a generalised negative binomial regression model with a log link and robust errors, where the dependent variable was the number of mosquitoes collected, and the independent variables were site (indicator variable), day, and household using STATA v 17.0 (StataCorp, College Station, TX, USA). The human blood index (HBI) was calculated as the proportion of mosquitoes that had fed on humans divided by total blood meals tested for each species.
Haplotype statistics including the number of haplotypes, haplotype frequency, and haplotype configuration, and tests of natural selection including Tajima’s D , Fu and Li’s D, and Fu and Li’s F  were calculated using DNA Sequence Polymorphism (DnaSP) software .