NetConfer represents an ensemble of workflows designed to provide a structured approach for exploring and comparing multiple (biological) networks through several classical and innovative visualization techniques. While the backend algorithms are coded in Python, with JavaScript and PHP for the web components, D3.js (https://d3js.org/) and Cytoscape.js [18] have been used for the frontend modules. Components of the NetworkX Python library (https://networkx.github.io) and the SNAP C++ library [19] have been employed in the backend for reliable and standard computation of individual network parameters/properties. Given the “multi-network” processing objective of the platform, end-users are allowed to supply multiple network files (a maximum of 8 networks) as delimited edge lists (Additional file 2: Fig. S1). Options such as the columns corresponding to the source and target, the delimiter, and the edge weight can be specified from the input form. It is pertinent to note that all the input network files in a given submission should be derived from a similar type of data, for example, multiple networks of genes, of proteins, or of microbes. In addition, the compared networks are expected to share at least some nodes in common to obtain meaningful insights. Nevertheless, NetConfer results are not biased by the nature of data, and the above recommendations are solely meant for optimal use of the various features of the tool.
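The input options described above (delimiter, source and target columns, optional edge weight) can be illustrated with a minimal parsing sketch. The function name `read_edge_list` and its parameters are hypothetical and are not part of NetConfer; the sketch further assumes that files have no header row and that networks are undirected.

```python
import csv
import io

def read_edge_list(handle, delimiter="\t", source_col=0, target_col=1, weight_col=None):
    """Parse a delimited edge list into {(node_a, node_b): weight}.

    Hypothetical helper: assumes no header row and undirected edges.
    """
    needed = max(source_col, target_col, weight_col if weight_col is not None else 0)
    edges = {}
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) <= needed:
            continue  # skip blank or truncated lines
        source, target = row[source_col].strip(), row[target_col].strip()
        weight = float(row[weight_col]) if weight_col is not None else 1.0
        # Undirected assumption: store each edge under a sorted node pair.
        edges[tuple(sorted((source, target)))] = weight
    return edges

sample = io.StringIO("geneA,geneB,0.8\ngeneB,geneC,0.5\n")
network = read_edge_list(sample, delimiter=",", weight_col=2)
```

Storing each edge under a sorted node pair makes later set operations (intersection, union) insensitive to the order in which the two nodes were listed in the file.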

Upon successful upload of networks, an interactive grouped bar chart-based preview of four global network properties (i.e., total nodes, total edges, clustering coefficient, and density) of all the submitted networks is generated, thereby enabling users to gather an initial idea of the input network structures (Additional file 2: Fig. S2). A status terminal in the user interface displays errors encountered while performing the background tasks. Users can modify the labels and colors associated with each network by accessing the color and label maps in the preview (Additional file 2: Fig. S3). Upon proceeding, the networks are automatically clustered based on their overall similarity (using the edge Jaccard index) and presented to users in the form of an interactive tree/dendrogram for ease of selection and subsequent application of workflows and visualizations (Additional file 2: Fig. S4). The working area is provided to the end-users in the form of a personalized dashboard comprising a “Workflows Dashboard” and a “Visualization Dashboard.” To start any analysis in the Workflows or Visualization Dashboard, users select two or more networks from the clustered tree. In addition to the checkbox-based selection feature on the tree, clicking on a node of interest in the tree redirects users to the graphical representation of the given network, wherein users can interact with, analyze, and customize the network graph.

Job management system

NetConfer provides an efficient job management system comprising the following components:

Unique identifier-based task initiation and recording

Tagging of unique identifier to a personalized dashboard specific to the job and end-users

User specific local job history management

Search and access

No requirement for registration or sharing of personal information

Each submission on the NetConfer web server is assigned a 10-character unique alphanumeric code (termed the JOB ID), which is displayed to the user for record keeping. The same is also stored locally in the user (and browser)-specific “Job History” page of NetConfer, wherein users can keep track of their personal submissions. After submission of data and generation of the JOB ID, the user is redirected to a personal “Workflows” dashboard. This dashboard, along with all associated modules, can be accessed by using the provided unique JOB ID. The search and access can be accomplished through the dedicated Job History page or through the Job Search widget provided on the “Home Page” and “Submissions” section of NetConfer. It is pertinent to note that NetConfer purges a job and its associated dashboard(s) 7 days after completion of the job. This unique “Job Management System,” coupled with personalized dashboards, ensures that users are able to use the platform for tracking their tasks and re-assessing their workflows without needing to register or share any personal information. In addition, the live status terminal in NetConfer provides end-users with dynamic updates about the status of the task being performed. This is complemented by information about the time taken for each task and is expected to enhance the user experience.

Workflows and modules

The available methodologies and visualizations in NetConfer can be broadly classified into two categories, each tagged to a dedicated personalized dashboard:

Analysis workflows

Visualization modules

Analysis workflows

This category provides five mutually exclusive, yet logically connected, multiple network analysis workflows, which are described below (Additional file 2: Fig. S5).

Assessing similarity of network components

Identification of common nodes and edges among a set of networks is one of the primary features considered while comparing multiple networks. Therefore, the first workflow has been designed to identify nodes as well as edges which are commonly shared between two or more of the selected set of networks. The results can be viewed using the following three ways:

The first method provides Venn diagrams augmented with interactive user operations to graphically display similarities and differences across user-chosen networks [20]. The overlapping regions correspond to the intersection of multiple networks, with the number depicting the number of common nodes/ edges among the intersecting networks. The list of common edges/nodes among two or more networks can be easily identified by clicking on the numbers displayed on the corresponding intersecting regions (Additional file 2: Fig. S6, S7).

The second method uses a powerful visualization technique called the “upset plot,” which displays the number of common nodes and edges across all combinations of networks [21]. The filled and connected circles in the lower part of the upset plot correspond to the different combinations of networks being considered, and the bar heights indicate the number of common nodes or edges of the corresponding combinations (Additional file 2: Fig. S8, S9). Clicking on any bar provides the corresponding common nodes/edges of the selected combination in the form of a list or a circular graph. If an “upset plot” of nodes is selected, only the constituent nodes of the combination are highlighted in the circle graph. Selecting a combination bar from the “upset plot” of edges shows the resultant network as a circle graph of the constituent nodes and edges. The generated “upset plots” can be sorted either by the size of the combinations (combination cardinality) or by the sets (set cardinality) (Additional file 2: Fig. S10).

The third method uses classical, yet highly interactive, customizable, and downloadable network diagrams to visualize the exclusive and intersecting sets of edges between the selected networks. This functionality can be accessed through the visualization modules presented in the Visualization Dashboard, discussed in a later section.
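The set comparisons underlying the Venn diagram and upset plot views can be sketched as follows. This is an illustrative example only: the function name `common_elements` and the toy edge sets are assumptions, and the sketch reports, for every combination of two or more networks, the elements shared by all members of that combination.

```python
from itertools import combinations

def common_elements(networks):
    """Map each combination of two or more network names to the elements they all share."""
    names = sorted(networks)
    shared = {}
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            shared[combo] = set.intersection(*(networks[n] for n in combo))
    return shared

# Toy edge sets (hypothetical data); the same function works on node sets.
edge_sets = {
    "net1": {("A", "B"), ("B", "C"), ("C", "D")},
    "net2": {("A", "B"), ("B", "C")},
    "net3": {("A", "B"), ("D", "E")},
}
shared = common_elements(edge_sets)

# One possible ordering of the bars: largest shared set first.
by_size = sorted(shared.items(), key=lambda item: (-len(item[1]), item[0]))
```

Each key corresponds to one column of connected circles in an upset plot, and the size of the associated set corresponds to the bar height.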

Biological use. This workflow can enable clinicians/researchers to easily obtain an overall idea of the similarities/dissimilarities between the uploaded networks. If two or more networks are very similar with respect to the constituent nodes, the values in the intersecting regions of the Venn diagram will be helpful to quantify as well as view the same. For example, the common genes perturbed in the virulent and avirulent infections could be identified using the Venn diagram of nodes in case study 2 (described later in the manuscript). Events like network rewiring can also be inferred for an uploaded set of networks if their constituent nodes appear very similar but show significant differences in the Venn diagram of edges.

Identifying and comparing key nodes

One of the most common ways of comparing networks is by studying different properties of the nodes (centrality measures, also termed local properties) and global properties of the network (Additional file 2: Fig. S11). NetConfer provides a dedicated workflow for assessing and comparing various global and local properties of all selected networks. Some of the most useful (local) properties of nodes in a network covered by NetConfer are degree, betweenness centrality, hub and authority score, eccentricity, and eigenvector centrality [4]. The workflow allows tabulated and graphical analysis and tracing of various properties across the selected set of networks. These properties can be viewed in sortable and searchable tables for comparing the node properties across different networks. Further, the table can be generated either with the values of one property of each node across different networks or with the different properties of all the nodes in one network. The first option is useful for understanding the changes in a centrality measure across different networks. Additionally, NetConfer utilizes interactive parallel coordinates, called “Delta Centrality,” to provide an innovative way of viewing changes in various centrality measures (Additional file 2: Fig. S12). A user can choose a centrality measure of interest by using the radio button and highlight the values for one or more selected nodes by clicking on the desired node names in the tabulated summary. This feature is especially useful for comparing spatio-temporally ordered networks.
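The “one property across networks” table can be illustrated with a small sketch that uses node degree as the property. The function name `degree_table` and the two toy networks are hypothetical; NetConfer itself relies on library implementations for the full set of centrality measures.

```python
def degree_table(networks):
    """Return {node: {network_name: degree}} for every node seen in any network."""
    table = {}
    for name, edges in networks.items():
        for u, v in edges:
            for node in (u, v):
                row = table.setdefault(node, {})
                row[name] = row.get(name, 0) + 1
    # A node that is absent from a network gets degree 0 there.
    for row in table.values():
        for name in networks:
            row.setdefault(name, 0)
    return table

# Two ordered networks (hypothetical time points t0 and t1).
networks = {
    "t0": [("A", "B"), ("A", "C")],
    "t1": [("A", "B"), ("B", "C"), ("B", "D")],
}
table = degree_table(networks)

# Change in degree between the two networks, per node.
delta = {node: row["t1"] - row["t0"] for node, row in table.items()}
```

Plotting each row of `table` as a line across the ordered networks gives a parallel-coordinates style view of how a node's centrality changes.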

Biological use. The centrality measures of various nodes in a network can be used for identifying critical or key nodes in a network. For example, in a microbial association network, degree and betweenness might be useful to identify key nodes (“microbes”) that help in microbial communication. Nodes with high betweenness score are essentially key points of information flow and if removed can disrupt the whole network. Similarly, hub and authority scores have been proven to be useful in identification of essential proteins in protein-protein interaction networks.

Comparing shortest paths

NetConfer allows a user to perform a comparative analysis of shortest paths between a given pair of “source” and “target” node across the selected set of networks using a novel interactive layout (Additional file 2: Fig. S13). The layout not only allows comparing multiple shortest paths within a network, but also across a selected set of networks in an easy and intuitive way. In the figure, the colors correspond to different networks (there may be more than one shortest path between two nodes in a single network), and the numbers correspond to the order of the nodes in the shortest path. The source and target nodes (as chosen by the user in the workflow) are always positioned at the bottom and top of the graph, respectively. Using this visualization, NetConfer makes it easy to identify nodes which are consistently present along the shortest path (indicating the preferred nodes) between the “source” and “target” nodes across different networks.

Biological use. Identification of shortest paths is useful during analyses of a range of biological networks like metabolic pathway analysis [22], alterations in protein-protein interaction networks [23], and order of interaction cascades in transcription factor networks [24]. Nodes which are consistently present in the shortest path between the “source” and “target” across different networks are likely to play an important role in the dynamics of the system. For example, in case study 2, gene “NCAPG” forms an important connecting member for the multiple differential shortest paths identified between genes “BIRC5” and “ASPM” (details are described later in the manuscript).

Inferring and comparing community structures

This workflow can be used to find and compare communities in a selected set of networks using innovative plots and tables. NetConfer offers a novel way of tracking changes in community structures across a pair of networks. Additional file 2: Fig. S14 shows an example of the heatmap-embedded, Sankey diagram-based community transition tracking utility of this workflow. On both vertical axes, the communities (which are easily distinguishable by color), along with their constituent member nodes, are arranged in descending order of size. Using the “node to node” flow between the two vertical axes, changes in the constituents of communities can be tracked easily, thereby helping users identify not only communities which are conserved across networks but also the ones which undergo reshuffling. Heatmap embeddings beside the nodes represent three important centrality measures, i.e., degree, hub score, and betweenness (whose values have been rank normalized across the given pair of networks). This feature allows easy identification of key nodes and tracking of their fate in the communities of the two networks being compared. Additionally, a tabulated summary of the “community shuffling” (with an intersection count and a Jaccard score of community similarity) is also presented for user convenience. The results are also depicted in a “comprehensively searchable” tabulated layout to help users identify the communities that comprise nodes (or groups of nodes) of interest (Additional file 2: Fig. S15). Highlighting of the search query further makes the results easy to comprehend. By default, the searchable and sortable table lists communities in descending order of size and colors them based on the parent network. Clicking on a community in the table opens a network visualization tab (described in detail under the “Visualization modules” section) displaying the community as a subnetwork (Additional file 2: Fig. S15).

Biological use. Closely linked hubs of interacting nodes represent a network community. Such communities provide important insights into the functional components and organization pattern of a biological system. For example, in a microbial association network, these modular hubs may constitute groups of microbes interdependent on each other for various functions. Understanding the nature and change in communities can hence be of great biological significance for community engineering experiments, understanding functional potential, pathogen colonization, etc. Further, understanding the changes in community structure across various states of a system (represented as a network) might help in identification of crucial “drivers” of the change [1].

Analysis and comparison of network cliques

This workflow is designed to identify and compare “cliques” between a selected set of input networks (Additional file 2: Fig. S16). Results are provided in a “searchable, sortable, highlight-enabled” tabulated framework, similar to the one implemented in the community workflow. Users can choose nodes of interest (or a combination thereof) to explore cliques across all the chosen networks for comparison. To aid visual analysis of the results, users have the option to view the individual cliques within the networks by simply clicking on the clique names. As in the previous workflow, users can track the members of the cliques in other networks as well, which is facilitated by coloring the member nodes gray and keeping the non-member nodes in their initial color (at the start of analysis). An additional feature of the visualization is the ability to click and drag the clique member nodes together, which makes it easier to view the member nodes and their connections (described later as a part of the case study).

Biological use. Cliques (closely knit subsets of nodes in a graph, such as dyads, triads, and tetrads) serve as useful indicators for identifying co-expressed genes, for finding (and comparing) motifs, protein complexes, and functional modules in protein-protein interaction (PPI) networks, and for understanding microbial symbiosis. These small subunits are similar to communities but are often more robust indicators of biological subunits in a system [25].

Visualization modules

Apart from the workflows for network comparison as described above, NetConfer allows users to visually compare the results in a variety of ways, as described below:

In the “Visualization Dashboard” (Additional file 2: Fig. S17), individual network can be viewed by clicking on the network names in the hierarchically clustered tree. The visualization offers users options to customize and interact with the network visualization using simple and intuitive operations like dynamic change of node, font and edge size, and network layouts. In addition, end-users can also overlay network properties like degree, betweenness, and coreness to proportionally size the nodes of the network. Along with the above module for viewing the networks individually, NetConfer also provides modules wherein users have the option to choose two or more networks and view subsets or supersets of the networks. All these modules provide customizable and interactive subsets/supersets. The modules and their utilities are described below:

Intersection visualization module. In this module, the edges which are common across all selected networks can be viewed. All the features applicable to the network view, as described above, can be applied to analyze the intersection network.

Exclusive visualization module. In this module, the edges present exclusively in each of the selected networks as compared to all the other selected networks can be visualized. Nodes are colored according to their presence in different selected networks and can be customized for size, font, and layout as well.

Union visualization module. Using this module, the nodes and edges present across all the selected networks can be visualized. To help users understand which networks a node belongs to, every node is colored like a pie chart, with the colors corresponding to the networks in which the node is present. In addition, the edges are also given multiple colors to identify the networks they are found in. For example, node A in Additional file 2: Fig. S18 has five colors, indicating its presence across all five networks, whereas node L has only one color, implying its presence in only one network (net 5). On the other hand, the edge between node D and node S is present in two networks, as determined by its colors (net 3 and net 5). Similarly, since the edge between node S and node A has only one color, this edge is present in one network (net 1).

Property mapped individual network visualization module. Visualization of the nodes along with their local properties, like degree, betweenness, and closeness, is often of interest to users. To facilitate this, NetConfer offers a network visualization wherein any of the node properties can be used to size the nodes. For example, in Additional file 2: Fig. S19, the nodes of network 1 (selected from the dropdown) are sized based on their degrees, which makes it easier to identify the important nodes as well as their connections. The layout can be modified using the network layout (random, grid, concentric, hierarchical, and degree sorted circular layout) and property modifier, and the view can be zoomed using the zoom-in and zoom-out buttons in the network view modifier.

Distance from the global union network. A simple yet useful utility to visualize the distance of the individual network from the global union is implemented in this module. The distances are calculated as Jaccard indices [26] of the nodes and edges of the respective networks from the global union and presented as a radar chart. The network names are displayed as dimensional anchors placed equidistant on the periphery of a circle. The points on the radial line connecting the center of the circle to each network represent the corresponding distance (node and edge displayed in orange and green color respectively) of that network from the global union (Additional file 2: Fig. S20).
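The set operations behind the intersection, exclusive, and union modules, together with the comparison of each network against the global union, can be sketched as follows. The function name `compare_edge_sets` and the toy data are hypothetical. The sketch reports the Jaccard index of each network's edge set against the union and does not assume how NetConfer converts that value into a plotted distance.

```python
def compare_edge_sets(networks):
    """Summarize shared, exclusive, and union edges for {name: set_of_edges}."""
    names = sorted(networks)
    union = set().union(*networks.values())
    intersection = set.intersection(*(set(networks[n]) for n in names))
    exclusive = {}
    for name in names:
        others = set().union(*(networks[o] for o in names if o != name))
        exclusive[name] = networks[name] - others
    # Which networks contain each edge (basis for multi-colored edges).
    membership = {edge: [n for n in names if edge in networks[n]] for edge in union}
    # Jaccard index of each network against the global union (assumes a non-empty union).
    jaccard_to_union = {
        n: len(networks[n] & union) / len(networks[n] | union) for n in names
    }
    return {
        "union": union,
        "intersection": intersection,
        "exclusive": exclusive,
        "membership": membership,
        "jaccard_to_union": jaccard_to_union,
    }

summary = compare_edge_sets({
    "net1": {("A", "B"), ("B", "C")},
    "net2": {("A", "B"), ("C", "D")},
    "net3": {("A", "B")},
})
```

Because every network is a subset of the union, the Jaccard index here reduces to the fraction of union edges that the network contains. Passing node sets instead of edge sets gives the corresponding node-level values.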

A flowchart of all the steps associated with requirements and submission of a task/job to NetConfer is summarized in Fig. 2. Figure 3 provides a gallery of important visualizations demonstrating the various outputs of NetConfer.

Fig. 2

A flowchart summary of all the steps associated with requirements and submission of a task/job to NetConfer

Fig. 3

A graphical summary of various key visualizations and associated analyses possible using NetConfer. Panel 1 shows graphs pertaining to the Workflows Dashboard, while panel 2 depicts the graphical results offered by the Visualization Dashboard of NetConfer. In panel 1, figure (a) is a typical layout of the edge Jaccard index-based network dendrogram containing selectable (checkbox) nodes; (b) represents the grouped bar chart-based “global property preview” generated immediately after submission of various network files to NetConfer; (c) highlights the upset plot offered in NetConfer for set-similarity analysis (classical Venn diagrams are also offered as an alternative); (d) represents the novel yet simple visualization approach for tracing shortest paths for a given pair of source and target nodes across networks of interest; (e) depicts the novel Sankey-heatmap coupled community transition visualization designed for tracing the changes in community memberships and centrality measures of the members between a given pair of networks; and (f) represents the tabulated visualization of communities observed in all uploaded networks, wherein nodes of interest can be searched, highlighted, and visualized in the main network as well. Cliques are also visualized using the same method(s); (g–k) represent visualization of the community of interest in various networks. In panel 2, figure (a) represents visualization of a network graph in the various layouts offered by NetConfer, (b) depicts the union visualization method adopted by NetConfer using pie-nodes and multi-colored edges, and (c) represents a radar chart showing relative distances of a set of selected networks from their union network

Network similarity calculation

NetConfer measures the similarity between a pair of networks using the edge Jaccard index [26], i.e., the ratio of the number of edges shared by the compared pair to the number of edges in their union. The all-versus-all edge distances are first collected in a matrix, which is then hierarchically clustered to generate a dendrogram using the Python networkx module (https://networkx.github.io). A hierarchy of the individual networks is built by progressively merging clusters obtained using the pairwise distance measures. The resultant tree is displayed in the main “Workflow Dashboard” with each leaf as a checkbox. In addition to the edge Jaccard, the node Jaccard distance is also calculated and used to determine the distance of each individual network from the union of the set. The output is displayed as a radar plot available in the “Visualization Dashboard.”
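The pairwise distance matrix and the progressive merging of clusters can be sketched in plain Python as shown below. Two assumptions are made here that the text does not confirm: the distance is taken as one minus the Jaccard index, and clusters are merged with average linkage. The function names and the toy networks are hypothetical.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """One minus the Jaccard index of two edge sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def cluster_networks(networks):
    """Average-linkage agglomerative clustering (assumed linkage); returns merge steps."""
    dist = {
        frozenset(pair): jaccard_distance(networks[pair[0]], networks[pair[1]])
        for pair in combinations(networks, 2)
    }

    def linkage(c1, c2):
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in sorted(networks)]
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
    return merges

merges = cluster_networks({
    "net1": {("A", "B"), ("B", "C"), ("C", "D")},
    "net2": {("A", "B"), ("B", "C")},
    "net3": {("D", "E"), ("E", "F")},
})
```

Each entry in `merges` records the two clusters joined and the height of the join, which is the information needed to draw a dendrogram.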

Shortest path calculation

A shortest path is a path that connects a given “source” node to a “target” node through the minimum number of edges in a given network. Multiple shortest paths can exist between one “source” and “target” node, with some nodes serving as preferred intermediates. NetConfer uses the implementation of Dijkstra’s algorithm [27] in the Python networkx module for this purpose. The algorithm grows a shortest path tree rooted at the source node and maintains two sets: one containing the nodes already included in the tree and the other containing the remaining nodes. In every step, the vertex in the second set with the least distance from the source is identified and added to the tree. For multiple input networks, NetConfer stores all the path information and displays the paths together as a path matrix. The layout of all shortest paths is designed in such a manner that the user-specified source and target nodes are positioned at the bottom and top of the path (respectively), and all other nodes between them are numbered in the order they appear in the path (starting from the source, which is assigned the order number 0). Consequently, the number at the top (pertaining to the target node) also indicates the total path length.
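The procedure described above can be sketched in plain Python. This is not the networkx implementation used by NetConfer but a minimal illustration of Dijkstra's algorithm that also records every predecessor on an equally short route, so that all shortest paths can be listed. The function names are hypothetical, and edge weights are assumed to be positive (1 for unweighted networks).

```python
import heapq

def build_adjacency(edges):
    """Undirected adjacency map {node: {neighbor: weight}} with unit weights."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, {})[v] = 1
        adjacency.setdefault(v, {})[u] = 1
    return adjacency

def all_shortest_paths(adjacency, source, target):
    """Return (distance, sorted list of all shortest paths); assumes positive weights."""
    dist = {source: 0}
    preds = {source: []}
    heap = [(0, source)]
    settled = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)  # node joins the shortest path tree
        for nbr, weight in adjacency.get(node, {}).items():
            candidate = d + weight
            if nbr not in dist or candidate < dist[nbr]:
                dist[nbr] = candidate
                preds[nbr] = [node]
                heapq.heappush(heap, (candidate, nbr))
            elif candidate == dist[nbr] and node not in preds[nbr]:
                preds[nbr].append(node)  # an equally short alternative route
    if target not in dist:
        return None, []

    def unwind(node):
        if node == source:
            return [[source]]
        return [path + [node] for p in preds[node] for path in unwind(p)]

    return dist[target], sorted(unwind(target))

adjacency = build_adjacency([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")])
length, paths = all_shortest_paths(adjacency, "A", "E")
# Nodes present on every shortest path (excluding source and target).
consistent = set.intersection(*(set(p[1:-1]) for p in paths))
```

The position of a node in each returned path is its order number (source is 0), so the index of the target equals the path length for unit weights.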

Community detection

NetConfer implements the “fast modularity maximization algorithm” [28] to identify communities in an input network. This algorithm has been shown to work efficiently even with larger input networks. We used the well-known SNAP library [19] for community detection. A modified C++ implementation from the SNAP library and in-house Python code were used in our platform to enable batch calculations. The same library is also used for calculating various graph properties in NetConfer. The community detection algorithm performs a greedy optimization of modularity using sophisticated data structures. A dendrogram of hierarchically decomposed communities is first created, with the vertices of the original network as leaves and the internal nodes representing the joins. The algorithm minimizes needless operations in storing the original data matrix and achieves a dramatic improvement in speed when compared to other implementations. A more detailed and technical description of the algorithm is provided by the authors in the original publication [28].
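The greedy idea behind modularity maximization can be illustrated with a deliberately naive sketch. It is not the SNAP implementation and uses none of the efficient data structures mentioned above: at each step it recomputes the modularity gain of merging every pair of connected communities and applies the best merge while the gain is positive. The function name and toy network are hypothetical, and the network is assumed to be undirected and unweighted.

```python
def greedy_modularity(edges):
    """Naive greedy modularity agglomeration; returns communities, largest first."""
    edges = {frozenset(e) for e in edges if e[0] != e[1]}
    m = len(edges)
    degree = {}
    for edge in edges:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    communities = {node: {node} for node in degree}  # label -> members
    while True:
        owner = {n: label for label, members in communities.items() for n in members}
        total_degree = {
            label: sum(degree[n] for n in members)
            for label, members in communities.items()
        }
        # Count edges running between each pair of distinct communities.
        between = {}
        for edge in edges:
            u, v = tuple(edge)
            if owner[u] != owner[v]:
                key = tuple(sorted((owner[u], owner[v])))
                between[key] = between.get(key, 0) + 1
        best_gain, best_pair = 0.0, None
        for (ci, cj), e_ij in sorted(between.items()):
            # Modularity change for merging ci and cj.
            gain = e_ij / m - total_degree[ci] * total_degree[cj] / (2 * m * m)
            if gain > best_gain:
                best_gain, best_pair = gain, (ci, cj)
        if best_pair is None:
            break
        ci, cj = best_pair
        communities[ci] |= communities.pop(cj)
    return sorted((sorted(c) for c in communities.values()), key=lambda c: (-len(c), c))

# Two triangles joined by a single bridge edge.
found = greedy_modularity([
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
])
```

The sequence of merges performed by the loop corresponds to the joins of the dendrogram described above; the efficient algorithm reaches the same kind of result without recomputing all gains at every step.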

Community transitions

The identified communities can be compared for similarities in their constituent nodes between a pair of networks. Each community in one network may split into multiple communities in the other network. In order to quantify this “community transition,” an individual Jaccard score as well as an intersection count is first calculated across all communities between the two networks and presented as a tabulated summary under the “community transitions” tab in “workflow 4.” Two other scores, namely sum Jaccard and weighted sum Jaccard, are also calculated to quantify the overall transition process. As the name implies, the sum Jaccard score is the cumulative sum of all the individual Jaccard scores, while its weighted version is calculated by dividing this sum by the total number of community comparisons made and multiplying the result by 100. Given that the Jaccard score is an “intersection over union” value, a higher value indicates a larger intersection and hence higher similarity. Consequently, a lower weighted sum Jaccard score indicates a higher amount of community transitions.
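The scores described above can be illustrated with a short sketch. It assumes that “total community comparisons” means every pairing of a community from the first network with a community from the second, which the text does not state explicitly. The function name and the toy communities are hypothetical.

```python
def community_transitions(communities_a, communities_b):
    """Pairwise Jaccard scores plus sum and weighted sum Jaccard.

    Assumes non-empty communities and that all pairs count as comparisons.
    """
    rows = []
    for i, ca in enumerate(communities_a):
        for j, cb in enumerate(communities_b):
            intersection = len(ca & cb)
            jaccard = intersection / len(ca | cb)
            rows.append((i, j, intersection, jaccard))
    sum_jaccard = sum(row[3] for row in rows)
    weighted_sum_jaccard = sum_jaccard / len(rows) * 100
    return rows, sum_jaccard, weighted_sum_jaccard

net_a = [{"a", "b", "c"}, {"d", "e"}]
net_b = [{"a", "b"}, {"c", "d", "e"}]
rows, sum_jaccard, weighted = community_transitions(net_a, net_b)
```

In this toy case node `c` moves from the first community to the second, so both large overlaps score 2/3 rather than 1, and the weighted score falls well below 100.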

Clique finding

Similar to the community detection implementation, NetConfer uses the C++ clique finding function available in the SNAP library (https://snap.stanford.edu/). The function enumerates maximal cliques within a reasonable time for a given network. The clique finding problem refers to finding the complete subgraphs of a given size (usually denoted as “k”) and subsequently finding whether any other cliques of higher size exist in the input graph. When a user sets the “k” value in “workflow 5,” all cliques of size ≥ k are calculated and displayed as a table. The tabulated result is rendered using DataTables (https://datatables.net), a JavaScript table library, so that quick searches with multiple filters can be performed.
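Clique enumeration with a minimum size threshold can be sketched with the Bron–Kerbosch algorithm with pivoting. This is a stand-in chosen for illustration and is not claimed to be the method used inside SNAP. The function name and toy network are hypothetical; the adjacency map is assumed to be symmetric and free of self-loops.

```python
def maximal_cliques(adjacency, k=3):
    """Enumerate maximal cliques with at least k nodes (Bron-Kerbosch with pivoting).

    adjacency: {node: set_of_neighbors}, symmetric and without self-loops.
    """
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            if len(current) >= k:
                cliques.append(sorted(current))
            return
        # Pivot on the node with the most neighbors among the candidates.
        pivot = max(candidates | excluded, key=lambda n: len(adjacency[n] & candidates))
        for node in list(candidates - adjacency[pivot]):
            expand(current | {node}, candidates & adjacency[node], excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return sorted(cliques)

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("C", "E"), ("D", "E"), ("E", "F")]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

triads_and_larger = maximal_cliques(adjacency, k=3)
dyads_and_larger = maximal_cliques(adjacency, k=2)
```

Lowering `k` only adds smaller maximal cliques to the result; cliques that are contained in a larger clique are never reported separately.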
