Developing a continental atlas of the distribution and trypanosomal infection of tsetse flies (Glossina species)

Background: Tsetse flies (Genus: Glossina) are the sole cyclical vectors of African trypanosomoses. Despite their economic and public health impacts in sub-Saharan Africa, it has been decades since the latest distribution maps at the continental level were produced. The Food and Agriculture Organization of the United Nations is trying to address this shortcoming through the Atlas of tsetse and African animal trypanosomosis.
Methods: For the tsetse component of the Atlas, a geospatial database is being assembled which comprises information on the distribution and trypanosomal infection of Glossina species. Data are identified through a systematic literature review. Field data collected since January 1990 are included, with a focus on occurrence, apparent density and infection rates of tsetse flies. Mapping is carried out at the level of site/location. For tsetse distribution, the database includes such ancillary information items as survey period, trap type, attractant (if any), number of traps deployed in the site and the duration of trapping (in days). For tsetse infection, the sampling and diagnostic methods are also recorded.
Results: As a proof of concept, tsetse distribution data for three pilot countries (Ethiopia, Kenya and Uganda) were compiled from 130 peer-reviewed publications, which enabled tsetse occurrence to be mapped in 1266 geographic locations. Maps were generated for eight tsetse species (i.e. G. brevipalpis, G. longipennis, G. fuscipes fuscipes, G. tachinoides, G. pallidipes, G. morsitans submorsitans, G. austeni and G. swynnertoni). For tsetse infection rates, data were identified in 25 papers, corresponding to 91 sites.
Conclusions: A methodology was developed to assemble a geospatial database on the occurrence, apparent density and trypanosomal infection of Glossina species, which will enable continental maps to be generated. The methodology is suitable for broad brush mapping of all tsetse species of medical and veterinary public health importance. For a few tsetse species, especially those having limited economic importance and circumscribed geographic distribution (e.g. fusca group), recently published information is scanty or non-existent. Tsetse-infested countries can adopt and adapt this approach to compile national Atlases, which ought to draw also on the vast amount of unpublished information.
Electronic supplementary material: The online version of this article (doi:10.1186/s13071-015-0898-y) contains supplementary material, which is available to authorized users.


Background
Unicellular parasites causing African trypanosomoses are transmitted to humans and animals alike by the bite of infected tsetse flies (Genus: Glossina). The diseases they cause, named sleeping sickness in humans and nagana in animals, continue to affect public health and socio-economic development in vast areas of sub-Saharan Africa. The annual economic losses attributed to African animal trypanosomosis (AAT) are measured in billions of dollars [1], while 70 million people are considered at risk of contracting the human form of the disease, i.e. human African trypanosomosis (HAT) [2].
Despite the severe impact of the disease, trypanosomosis control suffered from serious neglect for a number of years [3]. More recently, important initiatives at the international level have tried to redress this, by setting ambitious targets for interventions against tsetse and trypanosomoses at the continental level. These include the Programme against African Trypanosomosis (PAAT), joining the efforts of the Food and Agriculture Organization of the United Nations (FAO), the World Health Organization (WHO), the International Atomic Energy Agency (IAEA) and the African Union, Interafrican Bureau for Animal Resources (AU-IBAR) [4]; successive resolutions of the World Health Assembly on the control and elimination of HAT [5,6]; and finally the decision of the AU Heads of State and Government to launch the Pan-African Tsetse and Trypanosomosis Eradication Campaign (PATTEC) [7].
Albeit with different emphases, all initiatives aimed at controlling or eliminating trypanosomoses should be informed by contemporary and accurate information on the geographic distribution of tsetse flies [8]. At a strategic level, large-scale, albeit relatively coarse, cartographic representations should provide information on the occurrence and abundance of flies, on the number of species present in an area, and on the possible isolation of fly populations. These factors may have important implications for the selection of priority areas [9] as well as for the choice of the most cost-effective strategy of intervention [10]. At a local level, fine-resolution entomological data are needed to plan and implement field activities, as well as for monitoring and evaluation purposes [11]. This is particularly true when the area-wide integrated pest management (AW-IPM) approach is adopted, whereby entire tsetse populations are targeted for control or to create and progressively expand tsetse-free zones [12,13].
Despite the overarching need for tsetse distribution datasets, the latest continental maps of tsetse distribution were assembled over three decades ago [14][15][16][17]. Since then, large-scale mapping endeavours have predominantly focused on statistical modelling as a convenient way to generate predictive maps. Different methods, including logistic regression [18] and discriminant analysis [19] have been used to model the historical tsetse distribution against a range of environmental variables. For these studies, attempts were made to gather more recent entomological information, but the bulk of the training datasets still relied on the maps generated in the 1970s by Ford and Katondo [14][15][16], whose gaps and limitations have always been acknowledged [19].
More recently, a number of studies on arthropod vectors of diseases have proved that systematic data assemblages from scholarly journals and grey literature can be a valuable tool to develop global maps [20]. In the field of African trypanosomoses, the increasing utilization of the Global Positioning System (GPS) and of Geographic Information Systems (GIS) underpinned a huge advance in sleeping sickness mapping, embodied in the Atlas of HAT [21], a WHO-led initiative jointly implemented with FAO within PAAT.
With a view to addressing the lack of up-to-date data on tsetse distribution at the regional and continental levels, FAO launched the Atlas of tsetse and AAT, an initiative jointly implemented with IAEA in the framework of PAAT. The Atlas aims to assemble geo-referenced information over the past 25 years on the occurrence of Glossina species and AAT, and to assist endemic countries in better managing, analyzing and sharing entomological and parasitological data for improved, evidence-based decision making.
The methodology to assemble the AAT component of the Atlas has been previously described [22]. The present paper focuses on the development of the tsetse component, and it presents results for Ethiopia, Kenya and Uganda.

Methods
The methodological framework for the development of the Atlas of tsetse and AAT has already been presented elsewhere [22]. In summary, publications and input data derived from the various sources are collated and stored in a centralised data repository. Subsequently, selected items of information are extracted from the sources, harmonized and entered into a geo-spatial database. In the process of data extraction, each data source is analysed independently by two persons.
For the Atlas of tsetse, the database is structured into three simple components devoted to data sources, geographical data and entomological information respectively. The entomological information is further divided into two sub-components: tsetse distribution and tsetse infection. The data abstraction protocol is described in detail in the Additional file 1. At the present stage of development, the Atlas is focusing on information published in peer-reviewed articles from scientific journals.
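The three-component structure can be pictured with a minimal Python sketch. All class and field names below are assumptions made for illustration; the actual table layout is the one defined in the data abstraction protocol (Additional file 1).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    """One publication stored in the repository (hypothetical field names)."""
    source_id: int
    authors: str
    title: str
    journal: str
    year: int
    country: str
    url: str = ""


@dataclass
class Site:
    """One geo-referenced location described in a data source."""
    site_id: int
    source_id: int      # link to the DataSource the site was extracted from
    name: str
    latitude: float     # decimal degrees, WGS84 datum
    longitude: float    # decimal degrees, WGS84 datum


@dataclass
class DistributionRecord:
    """Occurrence of one species/subspecies in one survey at one site."""
    site_id: int
    species: str
    present: bool
    apparent_density: Optional[float] = None  # flies/trap/day, if reported


@dataclass
class InfectionRecord:
    """Trypanosomal infection data for one species at one site."""
    site_id: int
    species: str
    flies_examined: int
    flies_infected: int
    diagnostic_method: str


@dataclass
class Atlas:
    """Container tying together the components described in the text."""
    sources: List[DataSource] = field(default_factory=list)
    sites: List[Site] = field(default_factory=list)
    distribution: List[DistributionRecord] = field(default_factory=list)
    infection: List[InfectionRecord] = field(default_factory=list)

    def records_for_source(self, source_id: int) -> List[DistributionRecord]:
        """Return the distribution records whose site came from one source."""
        site_ids = {s.site_id for s in self.sites if s.source_id == source_id}
        return [r for r in self.distribution if r.site_id in site_ids]
```

Keeping sources, sites and entomological records in separate tables linked by identifiers means that a single publication can contribute many sites, and a single site can carry several species records.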

Data sources
Peer-reviewed scholarly articles containing spatially explicit data on tsetse distribution and infection are identified through a systematic literature review [22]. Suitable publications are stored in the data repository (PDF format), and their details are recorded in the "Data sources" table of the database. These details include author(s), title, journal, year of publication, study country and URL. Papers are only taken into consideration if they contain entomological data collected from January 1990 onwards. The databases and catalogues searched as well as the search terms used are provided in Additional file 2.

Geographical data
The table for geographical data is designed to capture information on the locations of entomological interest, as described in the data sources. It includes the location name and its geographic coordinates (WGS84 datum), country, other administrative units and the source of the coordinates. The geo-referencing procedures are similar to those developed for the Atlas of HAT [23].
In the Atlas of tsetse, the basic mapping unit is the 'site', as described in the source papers. Although the vast majority of tsetse occurrence data are collected in the field by means of traps, the exact coordinates and position of individual traps are rarely reported in scientific papers. Normally, in each site for which tsetse occurrence data are reported and published, one or more traps will have been deployed. In the Atlas of tsetse, the coordinates of the site are meant to represent the average coordinates of the traps deployed in that site. The attribute 'AREA' enables the size of the zone represented by each point/site to be recorded; if the exact area is neither reported in the paper nor can be calculated accurately, the attribute 'AREA_TYPE' provides an approximate categorization ranging from '≤10 km²' to '>1000 and ≤10,000 km²' [22].
In the rare cases when trap-level data are made available in the scientific paper (e.g. by means of supplementary material), entomological data are aggregated at the 'site' level before being imported in the database.
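The aggregation of trap-level data to the 'site' level can be sketched as follows. This is only an illustration under stated assumptions: the dictionary keys (`lat`, `lon`, `flies`) are hypothetical, and the site coordinates are taken as the simple arithmetic mean of the trap coordinates, which is adequate for traps that lie close together.

```python
from statistics import mean
from typing import Dict, List


def aggregate_traps_to_site(traps: List[Dict[str, float]]) -> Dict[str, float]:
    """Collapse trap-level records into one site-level record.

    Each trap is assumed to be a dict with 'lat' and 'lon' (decimal degrees,
    WGS84) and 'flies' (total catch of that trap). Key names are hypothetical.
    """
    if not traps:
        raise ValueError("at least one trap is required")
    return {
        # Site coordinates represent the average position of the traps.
        "lat": mean(t["lat"] for t in traps),
        "lon": mean(t["lon"] for t in traps),
        # Number of traps and total catch are kept as ancillary information.
        "n_traps": len(traps),
        "flies": sum(t["flies"] for t in traps),
    }
```

Calling the function on the traps of one site returns a single record that can then be stored as one point in the geographical data table.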

Entomological data: tsetse distribution
The tsetse distribution table is where tsetse occurrence and abundance data are recorded. The table includes the survey period, the type of trap, the attractant used (if any), the number of traps deployed in the site, and the duration of trapping (in days). Depending on the type of information available in the data source, the database enables absence/presence, apparent densities (flies/trap/day) and the total number of flies captured to be recorded. Although much less common than stationary traps, fly-rounds are still used to survey tsetse distribution, and their results are therefore included in the Atlas.
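When a paper reports only the total catch, the apparent density in flies/trap/day can be derived from the ancillary fields. The sketch below assumes that all traps in the site operated for the same number of days; the function and parameter names are illustrative and not part of the Atlas protocol.

```python
def apparent_density(total_flies: int, n_traps: int, trapping_days: float) -> float:
    """Return the apparent density in flies/trap/day.

    Assumes every trap in the site was operated for `trapping_days` days.
    """
    if total_flies < 0:
        raise ValueError("total_flies cannot be negative")
    if n_traps <= 0 or trapping_days <= 0:
        raise ValueError("n_traps and trapping_days must be positive")
    return total_flies / (n_traps * trapping_days)
```

For example, 120 flies caught by 10 traps over 3 days corresponds to 4 flies/trap/day, while a survey with no catch yields an apparent density of zero.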
If possible, tsetse data are recorded separately for the different species and subspecies. Otherwise, data are referred to 'Genus: Glossina'. The tsetse distribution table was constructed in such a way that data for different tsetse species/subspecies are recorded in separate records. As a result, several records can refer to the same survey (i.e. one record for each tsetse species/subspecies detected in the survey).
Information is included on the strategy of trap deployment (e.g. whether traps were placed randomly or, as is most often the case, deployed in the most favourable environments for tsetse). Also, information on recent or ongoing interventions against tsetse is recorded. In this context, recent interventions are those implemented in the few years preceding the survey.
The tsetse distribution table was mainly conceived to record data collected in cross-sectional surveys. However, useful data on tsetse occurrence can also be extracted from longitudinal surveys, which are subsequently flagged as such in a dedicated column of the table.

Entomological data: tsetse infection
The tsetse infection table includes a unique identifier of the survey, the survey period, the sampling method, the number and species of tsetse flies examined, and the diagnostic method. Trypanosomal infections can be captured in the Atlas in terms of presence/absence, prevalence and number of infected flies. Information on the use of trypanocidal drugs in domestic animals is recorded, as this can affect tsetse infection rates. More details on diagnostic methods, sampling methods, and the species of trypanosomes considered in the Atlas are provided below.

Trypanosome species
All causative agents of African trypanosomoses (encompassing both AAT and HAT) are included in the Atlas (tsetse infection table). These include Trypanosoma vivax, T. congolense, T. brucei, T. simiae and T. suis. Sub-species (i.e. T. b. brucei, T. b. gambiense and T. b. rhodesiense) and sub-groups (i.e. T. congolense savannah, forest/riverine and Kenya coast/kilifi, and T. simiae Tsavo) are also recorded if molecular tools enabled this level of characterization. Furthermore, T. godfreyi, which affects hosts similar to those of T. simiae, is included in the database, although it has never been detected in vertebrate hosts [24,25].

Diagnostic methods
The trypanosome infection rate is normally defined as the percentage of flies having mature (fully developed) trypanosome infections in the gut, proboscis and salivary glands [26]. Microscopic examination of these organs is the traditional method for determining the infection rate [27]. Infections in the proboscis only are categorised as subgenus Duttonella, while infections in both the proboscis and midgut are considered as Nannomonas. Where infections involve the salivary glands, the trypanosomes are classified as the subgenus Trypanozoon. If trypanosomes are present in the midgut only, the infection is considered as immature. In the Atlas, both mature and immature infections are included, although only mature infections are considered to contribute to the infection rate.
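The organ-based classification described above can be written as a small decision rule. The sketch below is illustrative only: the function names and the label for uninfected flies are assumptions, and, like microscopy itself, the rule cannot resolve mixed infections.

```python
from typing import Iterable, Tuple

MATURE_TYPES = {"Duttonella", "Nannomonas", "Trypanozoon"}


def classify_infection(proboscis: bool, midgut: bool, salivary_glands: bool) -> str:
    """Classify a dissected fly from the organs found to harbour trypanosomes."""
    if salivary_glands:
        return "Trypanozoon"   # infection involves the salivary glands
    if proboscis and midgut:
        return "Nannomonas"    # proboscis and midgut
    if proboscis:
        return "Duttonella"    # proboscis only
    if midgut:
        return "immature"      # midgut only
    return "negative"          # no trypanosomes seen (label is an assumption)


def infection_rate(flies: Iterable[Tuple[bool, bool, bool]]) -> float:
    """Percentage of examined flies with a mature infection.

    Each fly is a (proboscis, midgut, salivary_glands) tuple of booleans.
    Immature infections are classified but do not count towards the rate.
    """
    results = [classify_infection(*fly) for fly in flies]
    if not results:
        raise ValueError("no flies examined")
    mature = sum(1 for r in results if r in MATURE_TYPES)
    return 100.0 * mature / len(results)
```

Checking the salivary glands first mirrors the rule that any infection involving those organs is assigned to Trypanozoon, regardless of the other organs.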
Depending on the authors, Duttonella infections can be reported as T. vivax-type or outright as T. vivax (while Nannomonas and Trypanozoon can be reported as T. congolense- and T. brucei-type respectively). As a rule, in the Atlas we reflect the naming used by the authors of the paper.
Dissection and microscopic examination are affected by inherent diagnostic limitations, in that they are not capable of distinguishing infections at the species/subspecies level. Furthermore, characterization of immature and mixed infections is not possible.
In recent years, molecular tools have been increasingly used to provide more accurate estimates of tsetse infection. These tools include DNA probes and polymerase chain reaction (PCR). Data obtained using these techniques are included in the Atlas, alongside other, less common techniques such as immunological and subinoculation methods.

Sampling methods
In the tsetse infection component of the Atlas, the field 'SAMPLING METHOD' records the procedures used to collect and/or select the biological material that was examined. Details on these procedures are essential to properly interpret the results of the analysis (i.e. absence/presence, prevalence and number of infections).
For the dissection and microscopic examination technique [27], it is necessary to know whether all captured tsetse were examined, or only a random (or non-random) sample thereof. Also, it is important to report which potentially infected organs were examined (i.e. midgut, proboscis or salivary glands); if, for example, salivary glands were not analysed, it will not be possible to make inferences on the occurrence of T. brucei-type infections.
When molecular tools are applied, it is important to record whether the analysis was conducted on all captured tsetse, on a random sample thereof, or on parasitologically-positive flies only (e.g. tsetse whose trypanosomal infection status had already been confirmed by microscopic examination). In the latter case, a note is made of whether the molecular examination was conducted on mature infections, immature ones, or both. Importantly, molecular data on both mature and immature infections are recorded, the former being more directly related to the risk of infection, the latter still being useful to map the areas of circulation of the parasite. Furthermore, the field 'SAMPLING METHOD' can be used to record whether molecular analysis was conducted on whole flies or on specific organs, and whether a specific set of primers was used on all samples, or only on a subsample thereof. An example of the latter is provided in [28], where primers for T. congolense Kenya coast/kilifi, T. simiae Tsavo (at the time recognized as T. congolense Tsavo) and T. godfreyi were applied only on samples that were negative to primers for T. brucei s.l., T. vivax, T. congolense savannah, T. congolense forest/riverine and T. simiae.

Results

Tsetse distribution
To map tsetse in Ethiopia, Kenya and Uganda, a total of 130 peer-reviewed publications were identified and processed (full list of papers in Additional file 3). These contain spatially-explicit data collected between January 1990 and December 2014. The extracted entomological datasets were mapped in 1266 distinct geographic locations.
Historical records from Ethiopia, Kenya and Uganda report the occurrence of four additional species of the fusca group (i.e. G. fusca congolensis, G. nigrofusca hopkinsi, G. fuscipleuris, G. medicorum) and one species of the morsitans group (G. m. centralis) [29]. However, our review could not identify published data on the presence of these five species for the period January 1990 to December 2014.
In addition to the maps of reported presence (Fig. 1), maps of apparent density can also be generated from the Atlas. In Fig. 2 we provide the example of G. pallidipes and G. fuscipes fuscipes, which are reported from all three target countries. In these maps, absence of detection is also included, which corresponds to an apparent density of zero flies/trap/day.

Tsetse infection
Data on tsetse infection were identified in, and extracted from, 25 papers (full list in Additional file 5). Tsetse flies infected with AAT-causing trypanosomes were detected in 91 locations of Ethiopia, Kenya and Uganda (Fig. 3). The vast majority of studies (22) mainly relied on dissection and microscopic examination [26,27], while other diagnostic techniques included inoculation of mice, immunological methods (i.e. dot-ELISA) and molecular techniques (i.e. DNA probes and PCR). Figure 4 focuses on the distribution of tsetse infections as determined through microscopic examination. For this figure, a threshold was applied to the minimum number of flies examined (i.e. 30), and both presence and absence of detection were mapped. Infections are presented as T. vivax-type (a), T. congolense-type (b) and T. brucei-type (c), but they also include infections reported as Duttonella, Nannomonas and Trypanozoon (or as T. vivax, T. congolense and T. brucei).
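The sample-size threshold used for the microscopy maps can be illustrated with a short filter. This is a sketch under assumptions: the record keys (`site`, `examined`, `infected`) are hypothetical, and the threshold of 30 is treated as inclusive, which the text does not state explicitly.

```python
from typing import Dict, List

MIN_FLIES_EXAMINED = 30  # threshold from the text; inclusive by assumption


def mappable_records(records: List[Dict], min_examined: int = MIN_FLIES_EXAMINED) -> List[Dict]:
    """Keep records with enough flies examined and flag detection status.

    Each record is assumed to hold 'site', 'examined' and 'infected' keys.
    Sites below the threshold are dropped; the remaining ones are mapped
    either as presence (at least one infected fly) or absence of detection.
    """
    mapped = []
    for record in records:
        if record["examined"] >= min_examined:
            mapped.append({
                "site": record["site"],
                "detected": record["infected"] > 0,
            })
    return mapped
```

Dropping small samples avoids mapping an absence of detection that merely reflects too few flies having been dissected.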

Discussion
We have developed and tested a methodology to generate large-scale maps of the distribution and trypanosomal infection of tsetse flies. The results obtained in the three pilot countries (i.e. Ethiopia, Kenya and Uganda) demonstrate that the approach is capable of producing valuable information on the geographical distribution of all tsetse species of major medical and veterinary importance. At the same time, the peer-reviewed publications used as input failed to provide data on four species of the fusca group (i.e. G. fusca congolensis, G. nigrofusca hopkinsi, G. fuscipleuris and G. medicorum) and one species of the morsitans group (i.e. G. m. centralis), for which historical records of presence in the region are available [29]. This lack of data is primarily to be ascribed to the limited geographic distribution of these species in the three target countries [16]. Furthermore, for most species of the fusca group, forest habitat [30] and a low medical and veterinary significance [31] discourage collection and publication of occurrence data. It appears that generating updated maps for tsetse species of the fusca group will pose special challenges, in particular for those characterized by a circumscribed geographic distribution and limited epidemiological importance.
When looking at tsetse species of the fusca and to some extent of the morsitans group, a high sensitivity to human encroachment [32] might also be the cause of a possibly shrinking geographic distribution in the study region. However, recent as well as historical data are so scanty that confirming or refuting this plausible hypothesis is impossible at this stage.
Care must also be taken when trying to interpret differences in the apparent density of tsetse flies, as this can be affected by a very wide range of factors. Most of the influencing factors are recorded in the database underpinning the Atlas (e.g. survey period/season, type of trap, attractants or lack thereof, strategy of trap deployment and possible tsetse control). However, even in the presence of such ancillary data, the generation of consistent maps of the fly densities over large areas appears as a challenging endeavour. To give just one example of the challenges, technical skills in deploying the traps can have a strong impact on catches, but it is impossible to account for this variable in the Atlas. Particular care must be taken not to consider the absence of capture in a particular trapping exercise as reflecting necessarily the absence of flies in a given area.
Fig. 1 The occurrence of Glossina species/subspecies in Ethiopia, Kenya and Uganda. Data collection period: January 1990 to December 2014. Data were extracted from peer-reviewed scientific papers.
Concerning the component on tsetse infection, our review confirmed that such data are scarce when compared to those on trypanosomal infections in humans [21] or non-human vertebrates [22]. Nevertheless, this additional information layer can prove useful in assessing the risk of both nagana and sleeping sickness. It can also provide information on interspecific and spatial variations in vectorial capacity of certain species or populations, which is of relevance for both epidemiological studies and control programmes.
The main constraint in the use of tsetse infection data is perhaps linked to the inherent diagnostic limitations of microscopic examination, which was the most commonly used technique during our 25-year study period. Beyond its known inability to characterize immature and mixed infections, a number of studies have highlighted the limited accuracy of the methodology [28,33]. Arguably, the underestimation of T. brucei infections could be the most systematic and consequential of such inaccuracies [34].

Conclusions
An Atlas of tsetse for Africa has the potential to provide information to a range of tsetse control and elimination activities, thus contributing towards achieving the goals of the PATTEC initiative. Furthermore, a number of research activities that have relied so far on predicted tsetse distributions based on historical data (e.g. [35,36]) will benefit from more recent, harmonized, and spatially explicit entomological information.
Although national-level maps can be derived from peer-reviewed publications (Additional file 4), improvement of these outputs could be achieved by incorporating unpublished data, while taking care not to introduce inaccuracies in the process. The component of the Atlas dealing with unpublished information would arguably be best developed at the country level, through the creation of national Atlases. Activities in this direction are already ongoing in a few countries (e.g. Sudan, Mali, Ethiopia and Zimbabwe), and FAO and IAEA are committed to continuing to provide the necessary backstopping and guidance to national authorities. The two United Nations specialised agencies are mandated to provide technical support for the socio-economic development of areas affected by tsetse and trypanosomosis. With their wide range of training and technical assistance activities, many of which have a field data collection component [13,37,38,39,40], they are well placed to assist countries in the development of such national Atlases.
Peer-reviewed publications are available and activities are ongoing to apply the presented methodology at the continental level. Once finalised, the outputs will be disseminated through the PAAT Information System [41]. Furthermore, upon completion of the data assembly and geo-referencing exercise, it will be possible to use geospatial and environmental modelling to make predictions for those sites in which recent field data are not available.

Additional files
Additional file 1: Protocol for the abstraction of data on the distribution and trypanosomal infection of tsetse flies. Description: The protocol describes how data on the occurrence, apparent density and infection rates of tsetse fly species (Glossina s.s.) are assembled into a geospatial database within the framework of the initiative "the Atlas of tsetse and African animal trypanosomosis".