Elsevier

Ecological Informatics

Volume 53, September 2019, 100978
An application of fuzzy logic to build ecological sympatry networks

https://doi.org/10.1016/j.ecoinf.2019.100978

Highlights

  • Sympatry networks could enrich biogeographic analysis but are difficult to compute

  • A novel and efficient approach for computing sympatry networks is proposed

  • The approach uses fuzzy logic and achieves great efficiency even for large datasets

  • Generated sympatry networks reveal underlying patterns in ecological datasets

  • Source code is freely released

Abstract

In recent years, sympatry networks have been proposed as a means to perform biogeographic analysis, but their computation has posed practical difficulties that limited their use. We propose a novel approach that brings well-established network analysis tools closer to the study of sympatry patterns, using both geographic and environmental data associated with the occurrence of species. Our proposed algorithm, SGraFuLo, combines fuzzy logic and numerical methods to compute the network of interest directly from point locality records, without the need for specialized tools such as geographic information systems, thereby simplifying the process for end users. By posing the problem in matrix terms, SGraFuLo achieves remarkable efficiency even for large datasets, taking advantage of well-established scientific computing algorithms. We present sympatry networks constructed using real-world data collected in Mexico and Central America and highlight the potential of our approach in the analysis of overlapping niches of species, which could have important applications even in evolutionary studies. We also present details on the design and implementation of the algorithm, as well as experiments that show its efficiency. The source code is freely released and datasets are also available to support the reproducibility of our results.

Introduction

Networks are composed of nodes and of edges between nodes that represent relations. They are a powerful tool to model and analyze systems, as they make it possible to visualize structure, find communities, measure the importance of specific nodes, identify patterns, and perform other tasks that provide a better understanding of the subject (Aggarwal and Wang, 2010; Newman, 2010). Dos Santos et al. (2008) proposed the idea of sympatry networks, where nodes represent species and an edge is drawn between two nodes whenever a spatial overlap between the species can be inferred from data point records, implicitly stating that spatial proximity may produce interactions between the species. Furthermore, the authors extended the idea to include, by means of weighted edges, numerical values indicating the strength of the spatial association between species with overlapping areas (Santos et al., 2012). The problem of estimating similarity indices between species based on their geographical area of occupancy is not new; other researchers have tackled it using a network model that allowed them to identify communities and suggest units of co-occurrence that satisfied the criteria of within-group sympatry and between-group allopatry (Dos Santos et al., 2008).
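To make the data structure concrete, the following is a minimal sketch of a weighted sympatry network stored as a dictionary of dictionaries. The function name `build_sympatry_network`, the `threshold` parameter, the species labels, and the toy association scores are all hypothetical and chosen only for illustration; they are not taken from the cited works.

```python
from itertools import combinations


def build_sympatry_network(species, association, threshold=0.0):
    """Return a weighted, undirected network as a dict of dicts.

    species: list of species names (the nodes).
    association: function (a, b) -> float giving the strength of the
        spatial association between two species (hypothetical input).
    threshold: an edge is drawn only when the weight exceeds this value.
    """
    network = {name: {} for name in species}
    for a, b in combinations(species, 2):
        weight = association(a, b)
        if weight > threshold:
            # Undirected edge: store the weight in both directions.
            network[a][b] = weight
            network[b][a] = weight
    return network


# Toy association scores (made-up values, for illustration only).
toy_scores = {
    frozenset(("sp_a", "sp_b")): 0.75,
    frozenset(("sp_a", "sp_c")): 0.0,
    frozenset(("sp_b", "sp_c")): 0.25,
}


def toy_association(a, b):
    return toy_scores[frozenset((a, b))]


network = build_sympatry_network(["sp_a", "sp_b", "sp_c"], toy_association)
```

In this toy case `sp_a` and `sp_c` share no edge because their score does not exceed the threshold, while `sp_b` is connected to both.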

The use of networks to study the connectivity structure among species in a geographical area could bring an insightful perspective to some ecological problems and allow the identification of biogeographical patterns. Broennimann et al. (2012) suggest that, given the importance of understanding the niche dynamics among species, it is necessary to quantify the degree of difference or similarity between and within the species' niches as an estimate of niche overlap. Sympatry networks could be used as a method to determine niche overlap matrices, based on the environmental distribution of species, when considering a set of coordinates (x, y) defined, for example, in a multivariate ordination space.

Building a sympatry network is not a trivial task, as it poses at least two important challenges. First, it requires the comparison of individual species areas that can only be inferred through data points. Approaches to this problem include establishing grid cells, delimiting each area by drawing propinquity circles or convex hulls, using minimum spanning trees to connect each species' point records, and making a direct comparison between point sets through some distance measure (Morrone, 2009; Santos et al., 2012). However, whatever method is used, there is unavoidable uncertainty arising from the records themselves and from the process used to infer ranges from point locality data. Second, building the network requires analyzing each pairwise relation between species; for a study that involves N species, each associated with possibly hundreds of records, the number of indicators that must be computed is on the order of N². Computational efficiency therefore becomes essential as the number of species and data point records grows.
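As a short worked count of the quadratic growth mentioned above (assuming one indicator per unordered pair of distinct species):

```latex
\[
  \binom{N}{2} \;=\; \frac{N(N-1)}{2} \;\sim\; \frac{N^{2}}{2},
  \qquad
  N = 100 \;\Rightarrow\; 4950 \text{ pairs},
  \qquad
  N = 1000 \;\Rightarrow\; 499\,500 \text{ pairs}.
\]
```

A tenfold increase in the number of species thus means roughly a hundredfold increase in pairwise comparisons, before accounting for the number of records per species.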

In this work we present an approach based on fuzzy logic that allows for a computationally efficient method to generate ecological sympatry networks. Introduced by Zadeh (1965), fuzzy sets are a generalization of ordinary sets proposed for managing the imprecision, complexity and uncertainty found in real-world problems (Klir and Yuan, 1995). Several authors have brought attention to the suitability of fuzzy logic for problems related to species geographic ranges, pointing out that it even resembles the way human observers make sense of the information to either infer presences or compare maps corresponding to different taxa (Barbosa, 2014; Hagen, 2003; Visser and Nijs, 2006). However, these authors apply fuzzy operators only after a matrix of presence-absence indicators has been established, so the difficulty of representing individual species areas remains in defining that matrix in the first place (Fig. 1).

By contrast, our method uses a fuzzy point of view to approach the task of inferring a shared area from recorded observations. To do so, the algorithm relies on two key ideas: (i) each point record can be used as evidence of the species' existence in the vicinity of that observation, and (ii) it is therefore reasonable to consider the neighboring points as belonging, to a certain distance-dependent extent, to the species' range. A fuzzy-logic point of view allows us to express the partial belonging of each area point to the species' range in light of the recorded evidence.

Our contribution is therefore twofold. First, we propose an approach to estimate spatial association between species by assigning partial membership to points surrounding data records and then combining values by means of numerical integration. In this way, the membership assigned to each point in the plane depends on the point record density in its proximity; areas with the strongest evidence contribute more to the overall estimation, whereas isolated points are taken into account without a significant impact on the result.
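The idea of distance-dependent partial membership combined through numerical integration can be sketched as follows. This is an illustration under stated assumptions, not the membership function or integration scheme used by SGraFuLo, which this excerpt does not specify: the linear decay with a cutoff `radius`, the algebraic-sum union (so that denser records yield higher membership), the `min` intersection, and the midpoint-rule grid are all choices made here for the example.

```python
import math


def point_membership(x, y, px, py, radius):
    """Assumed decay: 1 at the record, falling linearly to 0 at `radius`."""
    d = math.hypot(x - px, y - py)
    return max(0.0, 1.0 - d / radius)


def range_membership(x, y, records, radius):
    """Membership of (x, y) in a species' range.

    Records are combined with the algebraic sum 1 - prod(1 - mu_i)
    (an assumption), so several nearby records raise the membership.
    """
    not_member = 1.0
    for px, py in records:
        not_member *= 1.0 - point_membership(x, y, px, py, radius)
    return 1.0 - not_member


def fuzzy_overlap(records_a, records_b, radius, bounds, steps=50):
    """Integrate min(mu_a, mu_b) over `bounds` with the midpoint rule."""
    xmin, xmax, ymin, ymax = bounds
    dx = (xmax - xmin) / steps
    dy = (ymax - ymin) / steps
    total = 0.0
    for i in range(steps):
        x = xmin + (i + 0.5) * dx
        for j in range(steps):
            y = ymin + (j + 0.5) * dy
            mu_a = range_membership(x, y, records_a, radius)
            mu_b = range_membership(x, y, records_b, radius)
            total += min(mu_a, mu_b)  # fuzzy intersection at this cell
    return total * dx * dy


# Toy records (made-up coordinates): b lies closer to a than c does.
a = [(0.0, 0.0)]
b = [(0.5, 0.0)]
c = [(1.5, 0.0)]
bounds = (-2.0, 3.0, -2.0, 2.0)
overlap_ab = fuzzy_overlap(a, b, radius=1.0, bounds=bounds)
overlap_ac = fuzzy_overlap(a, c, radius=1.0, bounds=bounds)
```

With these toy inputs the estimated shared area shrinks as the records move apart, and it is exactly zero once the two neighborhoods no longer intersect.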

Second, we present the algorithm SGraFuLo (Sympatry Graph through Fuzzy Logic), which uses the previous ideas to estimate, with remarkable efficiency, the pairwise intersection area of different species on the sole basis of recorded locality points. The algorithm represents a different conceptual approach to the problem of creating sympatry networks and achieves significantly better running times than existing software, whether that software creates sympatry networks using a non-fuzzy approach or uses a fuzzy approach to estimate binary indices. In this way, the algorithm eases the modelling of shared areas by sympatry networks and facilitates the use of well-established network analysis tools in the study of biogeographical patterns.
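The abstract states that the problem is posed in matrix terms, but this excerpt does not show the formulation. One way such a formulation could look, purely as an assumption, is to sample each species' membership on a common grid, store the samples as the rows of a matrix M, and take the product t-norm as the fuzzy intersection; the integral of the pairwise intersection is then approximated by the cell area times M Mᵀ. The function name `pairwise_overlap` and the toy matrix below are hypothetical.

```python
def pairwise_overlap(M, cell_area):
    """Approximate all pairwise overlaps as cell_area * (M @ M^T).

    M: list of rows, one per species; M[i][k] is the membership of
       grid cell k in the range of species i (values in [0, 1]).
    Assumes the product t-norm for intersection, so the overlap of
    species i and j is the sum over cells of M[i][k] * M[j][k].
    """
    n = len(M)
    return [
        [cell_area * sum(p * q for p, q in zip(M[i], M[j])) for j in range(n)]
        for i in range(n)
    ]


# Toy membership matrix: 3 species sampled on 3 grid cells (made-up values).
M = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.0, 1.0],
]
overlaps = pairwise_overlap(M, cell_area=2.0)
# overlaps == [[2.5, 2.0, 0.0], [2.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
```

The appeal of this kind of formulation is that all N² indicators come from a single matrix product, which optimized linear algebra libraries handle efficiently; the pure-Python loops here only illustrate the arithmetic.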

Our work and experimental settings follow the general problem as posed by previous authors (Dos Santos et al., 2008; Torres-Miranda et al., 2013) and, therefore, our choice of fuzzy membership function and related parameters adheres to a general case in which there is no further information on environmental variables. However, it is worth mentioning that our algorithm can be easily tuned to incorporate contextual knowledge and enrich the model of a particular geographical area.

Because fuzzy theory is not as widely known as probability theory and some confusion could arise regarding the interpretation of our results, we start this work by briefly discussing how the fuzzy point of view differs from a probabilistic approach to the problem. Then, we move to the technical mathematical and numerical aspects used to develop our algorithm. Finally, we present some experiments showing the efficiency of SGraFuLo and the quality of the sympatry networks it computes. Our algorithm implementation and most of the datasets used in our experiments are fully available for reproducibility purposes.

Materials and methods

Fuzzy logic (Zadeh, 1965) is an extension of Boolean logic that challenges the notion that an item either belongs to a set or does not. Instead, fuzzy logic proposes that an item can partially belong to the set: a value of 0 indicates that the element is not in the set at all, a value of 1 means the element is definitely in the set, and numbers in between indicate a partial degree of membership.
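As a small illustration of partial membership, the standard operators on fuzzy sets introduced by Zadeh (complement as 1 − μ, intersection as the minimum, union as the maximum) can be written as below. The function names and the example membership values are hypothetical, and other operator families exist; this excerpt does not say which ones the algorithm uses.

```python
def fuzzy_not(mu):
    """Standard fuzzy complement."""
    return 1.0 - mu


def fuzzy_and(mu_a, mu_b):
    """Standard fuzzy intersection (minimum)."""
    return min(mu_a, mu_b)


def fuzzy_or(mu_a, mu_b):
    """Standard fuzzy union (maximum)."""
    return max(mu_a, mu_b)


# Made-up degrees to which one location belongs to two species' ranges.
in_range_a = 0.75
in_range_b = 0.25

in_both = fuzzy_and(in_range_a, in_range_b)    # 0.25
in_either = fuzzy_or(in_range_a, in_range_b)   # 0.75
outside_a = fuzzy_not(in_range_a)              # 0.25
```

When memberships are restricted to 0 and 1, these operators reduce to the usual Boolean AND, OR and NOT, which is the sense in which fuzzy logic extends Boolean logic.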

Because both deal with uncertainty, the difference between probability theory and fuzzy logic and between

Results

The ideas presented in previous sections were implemented using the C++ programming language. In this section we obtain sympatry networks using field data collected in Mexico and other areas in North and Central America. Except for a new, still-unpublished dataset (Central), the data used in our experiments has been reported in Torres-Miranda et al. (2011, 2013) and Ramírez-Toro et al. (2017a, 2017b) and, like SGraFuLo, is fully available to ensure the

Discussion

We presented an algorithm, SGraFuLo, that uses a fuzzy-logic approach and numerical methods to determine the spatial association between any given pair of species starting only from data point representations of the species occurrences; that is, the algorithm can work with geographical information as well as with environmental or climatic information associated with each point of occurrence. We described the mathematical foundations of the algorithm as well as details on its implementation. The

Acknowledgments

This project was supported by DGAPA - PAPIIT (UNAM) Grant Number IA106316. All provided data was collected and updated by Andrés Torres Miranda, supported by DGAPA - PAPIIT (UNAM) Grant Numbers IV201015 and IA208218. The authors would also like to personally thank Teresa Patiño Cárdenas, Adriana Menchaca Méndez, Sergio Rogelio Tinoco Martínez and Víctor Breña Medina for their input and support.

All authors contributed critically to the drafts and gave final approval for publication. All data was

References (35)

  • O. Broennimann et al. Measuring ecological niche overlap from occurrence and spatial environmental data. Glob. Ecol. Biogeogr. (2012)

  • R. Brummitt. World Geographical Scheme for Recording Plant Distributions. (2001)

  • D.A. Dos Santos et al. Sympatry inference and network analysis in biogeography. Syst. Biol. (2008)

  • F.L. Gall. Faster algorithms for rectangular matrix multiplication. Proceedings of the 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science (2012)

  • A. García-Mendoza. Biodiversidad en Oaxaca, chapter Integración del conocimiento florístico del estado, p. 305–325

  • G. Guennebaud et al. Eigen v3

  • A. Hagen. Fuzzy set approach to assessing similarity of categorical maps. Int. J. Geogr. Inf. Sci. (2003)