NodeXL


NodeXL is a network analysis and visualization software package for Microsoft Excel 2007/2010/2013/2016. The free version contains network visualization and social network analysis features. The commercial version includes access to social media network data importers, advanced network metrics, and automation. It is a popular package, similar to other network visualization tools such as Pajek, UCINet, and Gephi.

Codebase

NodeXL is a set of prebuilt class libraries using a custom Windows Presentation Foundation control. Additional .NET assemblies can be developed as "plug-ins" to import data from outside data providers. Currently implemented data providers for NodeXL include Facebook, Twitter, Wikipedia, web hyperlinks, and Microsoft Exchange Server.

Features

NodeXL is intended for users with little or no programming experience, allowing them to collect, analyze, and visualize a variety of networks. NodeXL integrates into Microsoft Excel 2007, 2010, 2013, and 2016 and opens as a workbook with a variety of worksheets containing the elements of a graph structure, such as edges and nodes. NodeXL can also import a variety of graph formats such as edge lists, adjacency matrices, GraphML, UCINet .dl, and Pajek .net.

Data Import

NodeXL Pro imports UCINet and GraphML files, as well as Excel spreadsheets containing edge lists or adjacency matrices, into NodeXL workbooks. NodeXL Pro also allows for quick collection of social media data via a set of import tools that can collect network data from e-mail, Twitter, YouTube, and Flickr. NodeXL requests the user's permission before collecting any personal data and focuses on the collection of publicly available data, such as Twitter statuses and follow relationships for users who have made their accounts public. These features let NodeXL users start working on relevant social media data immediately and combine social media data collection and analysis in one tool.
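
The two spreadsheet layouts mentioned above, edge lists and adjacency matrices, carry the same information. The following Python sketch illustrates the conversion from one to the other. It is not NodeXL's importer; the function name `adjacency_to_edges` and the sample labels are hypothetical, chosen only for illustration.

```python
def adjacency_to_edges(labels, matrix, directed=False):
    """Convert a square adjacency matrix into (source, target, weight) rows.

    Hypothetical helper for illustration; not part of NodeXL.
    """
    n = len(labels)
    if len(matrix) != n or any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square and match the label count")

    edges = []
    for i in range(n):
        # An undirected matrix is symmetric, so scanning the upper triangle
        # (including the diagonal) emits each edge exactly once.
        start = 0 if directed else i
        for j in range(start, n):
            weight = matrix[i][j]
            if weight:
                edges.append((labels[i], labels[j], weight))
    return edges


labels = ["Ann", "Bob", "Cat"]
matrix = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]

undirected_edges = adjacency_to_edges(labels, matrix)
directed_edges = adjacency_to_edges(labels, matrix, directed=True)
print(undirected_edges)
```

Read as an undirected graph, the symmetric matrix yields two edges. Read as a directed graph, the same matrix yields four, one per nonzero cell.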

Data Representation

NodeXL workbooks contain four worksheets: Edges, Vertices, Groups, and Overall Metrics. The relevant data about entities in the graph and relationships between them are located in the appropriate worksheet in row format. For example, the Edges worksheet contains a minimum of two columns, and each row has a minimum of two elements corresponding to the two vertices that make up an edge in the graph. Graph metrics and edge and vertex visual properties appear as additional columns in the respective worksheets. This representation allows the user to leverage the Excel spreadsheet to quickly edit existing node properties and to generate new ones, for instance by applying Excel formulas to existing columns.
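
The row-per-entity layout can be mimicked outside Excel. The Python sketch below models an Edges sheet as a list of rows and derives a Vertices sheet from it, with one metric column and one formula-style column. The column names and the size formula are assumptions made for this example, not a specification of the NodeXL workbook.

```python
from collections import Counter

# Edges sheet: one row per edge, with two required columns naming its endpoints.
# Column names here are illustrative assumptions.
edges = [
    {"Vertex 1": "Ann", "Vertex 2": "Bob"},
    {"Vertex 1": "Ann", "Vertex 2": "Cat"},
    {"Vertex 1": "Bob", "Vertex 2": "Cat"},
    {"Vertex 1": "Cat", "Vertex 2": "Dan"},
]

# Count how many edge rows mention each vertex (its degree).
degree = Counter()
for row in edges:
    degree[row["Vertex 1"]] += 1
    degree[row["Vertex 2"]] += 1

# Vertices sheet: one row per vertex, with the metric as an extra column.
vertices = [{"Vertex": name, "Degree": degree[name]} for name in sorted(degree)]

# A derived column, analogous to filling a spreadsheet formula down a column.
# The formula itself is a made-up example.
for row in vertices:
    row["Size"] = 1.0 + 0.5 * row["Degree"]

for row in vertices:
    print(row)
```

Because metrics and visual properties are ordinary columns, adding a new property is just adding a column computed from existing ones.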

Graph Analysis

NodeXL Pro contains a library of commonly used graph metrics, such as centrality, clustering coefficient, and diameter. NodeXL differentiates between directed and undirected networks. NodeXL Pro implements a variety of community detection algorithms to allow the user to automatically discover clusters in their social networks.
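
To make the named metrics concrete, here is a minimal Python sketch of their textbook definitions for a small undirected graph. It is not NodeXL's implementation. It assumes a simple, connected graph with at least two vertices, and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree divided by the maximum possible degree, n - 1."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


def eccentricity(adj, source):
    """Longest shortest-path distance from source, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def diameter(adj):
    """Largest eccentricity over all vertices (connected graphs only)."""
    return max(eccentricity(adj, v) for v in adj)


adj = build_adjacency([("Ann", "Bob"), ("Ann", "Cat"), ("Bob", "Cat"), ("Cat", "Dan")])
centrality = degree_centrality(adj)
print(centrality["Cat"], clustering_coefficient(adj, "Cat"), diameter(adj))
```

In this example Cat touches every other vertex, so its degree centrality is 1.0. Only one of the three pairs of its neighbours is linked, giving a clustering coefficient of 1/3. The longest shortest path (for example Ann to Dan) has length 2.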

Graph Visualization

NodeXL generates an interactive canvas for visualizing graphs. The project allows users to pick from several well-known force-directed graph drawing layout algorithms such as Fruchterman-Reingold and Harel-Koren. NodeXL allows the user to multi-select and drag and drop nodes on the canvas and to manually edit their visual properties. In addition, NodeXL allows users to map the visual properties of nodes and edges to metrics it calculates, and in general to any column in the Edges and Vertices worksheets.
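
As a rough illustration of how a force-directed layout works, the following Python sketch implements a simplified version of the Fruchterman-Reingold scheme: every pair of vertices repels, connected vertices attract, and a cooling "temperature" caps how far a vertex moves per step. It is a teaching sketch, not NodeXL's code. The parameter defaults and the linear cooling schedule are assumptions.

```python
import math
import random


def fruchterman_reingold(vertices, edges, width=1.0, height=1.0,
                         iterations=50, seed=0):
    """Simplified Fruchterman-Reingold layout; returns {vertex: (x, y)}."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in vertices}
    k = math.sqrt(width * height / len(vertices))  # ideal distance between vertices
    temperature = width / 10.0                     # maximum move per iteration
    cooling = temperature / (iterations + 1)       # linear cooling (an assumption)

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in vertices}

        # Repulsion between every pair of vertices: magnitude k^2 / distance.
        for i, v in enumerate(vertices):
            for u in vertices[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist
                fx, fy = dx / dist * force, dy / dist * force
                disp[v][0] += fx
                disp[v][1] += fy
                disp[u][0] -= fx
                disp[u][1] -= fy

        # Attraction along each edge: magnitude distance^2 / k.
        for u, v in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            force = dist * dist / k
            fx, fy = dx / dist * force, dy / dist * force
            disp[v][0] -= fx
            disp[v][1] -= fy
            disp[u][0] += fx
            disp[u][1] += fy

        # Move each vertex, capped by the temperature and clamped to the frame.
        for v in vertices:
            dx, dy = disp[v]
            length = max(math.hypot(dx, dy), 1e-9)
            step = min(length, temperature)
            pos[v][0] = min(width, max(0.0, pos[v][0] + dx / length * step))
            pos[v][1] = min(height, max(0.0, pos[v][1] + dy / length * step))

        temperature -= cooling

    return {v: (p[0], p[1]) for v, p in pos.items()}


vertices = ["Ann", "Bob", "Cat", "Dan"]
edges = [("Ann", "Bob"), ("Ann", "Cat"), ("Bob", "Cat"), ("Cat", "Dan")]
layout = fruchterman_reingold(vertices, edges)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

The fixed seed makes the layout reproducible. The all-pairs repulsion step is quadratic in the number of vertices, which is one reason multilevel methods such as Harel-Koren are used for larger graphs.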

Research

NodeXL has been used by news outlets such as Foreign Policy to visualize the structure of conversations about political topics, and by organizations such as the World Bank to analyze voting data. NodeXL has been used as an analytical tool in dozens of research papers in the social, information, and computer sciences, and has itself been the focus of research in human-computer interaction, data mining, and data visualization.

Resources