Skip to content. Skip to main navigation.
As a result of our increased ability to collect data from different sources, many real-world datasets are increasingly becoming multi-featured and these features can also be of different types. Examples of such multi-featured data include different modes of interactions among people (Facebook, Twitter, LinkedIn, telephone calls ...), traffic accidents associated with diverse factors (speed, light conditions, weather ...) or inter-connectivity among different networks like city-based airline, residence and collaboration networks. A multiplex is a network of networks that is ideal for analyzing these types of datasets by modeling each aspect separately into layers. Based on the type of entities, multilayer networks have been differentiated into two types - homogenous (multiple interactions among same type of entities) and heterogeneous (multiple interactions among different types of entities). Multiplexes have computational advantages over monoplexes (single graphs) for modeling and analyzing multi-source, multi-feature data. However, this multiplex-based modeling brings with itself a new set of challenges holistically analyzing such data. The literature focusses on overall multiplex diagnostics by either considering each layer individually or aggregating all the layers leading to loss of information, thus missing out on intricate details due to absence of ways that can analyze these multiplex-based representations in between these two extremes. We are aligned towards building a suite of efficient composition-based algorithms for flexible analysis of multilayer networks.
We have proposed the combination of homogeneous multilayer layers using Boolean operations - AND, OR, NOT. Clearly, for n features, a total of 2n layer combinations are possible causing the overall analysis process to become exponential with respect to both time and space. In this regard, we propsed an approach to re-create the communities of any AND-composed layer by just using the layer-wise communities, which has been empirically shown to provide an overall saving of more than 40% in computation time. We have been able to propose four heuristics based on degree and closeness centrality to estimate the hubs (or central entities) of any AND-composed layers, which when applied to real-world, multi-featured datasets such as IMDb and traffic accidents, provide an average accuracy of more than 70-80% while reducing the overall computational time by at least 30%. Currently, we are in the process of formulating techniques to re-construct the communities and hubs for OR-composed and NOT-composed layers. Moreover, we are working on obtaining a confidence interval on the accuracy and savings in computational time based on layer characteristics. In future, we plan to extend this work to weighted and directed edges as well. The development of community and hub reconstruction techniques using the primitive Boolean operators, will allow us to flexibly analyze other possible combinations of k layers like XOR, NOR, NAND and so on.
With respect to heterogeneous multiplexes the area of graph mining and querying has not been explored much. Further, the bipartite links between the layers makes the composition non-trivial. We are addressing the various challenges involved in mining frequent patterns and querying across inter-linked multiplex layers. Further, there is a need to quantify and develop effecient methods for generating the multilayer hubs and communities.