Creating Graphs and Graph Types
If you followed the previous blog post correctly, you should now have networkx and pandas successfully installed on your system. It’s now time to create some graphs, but first a little theory.
Theory
Networks generally have two things in common: they have nodes and edges. Although there are multiple types of networks, all of them contain nodes (also known as vertices or points) and edges. Nodes can represent many things (people, computers, entities, etc.) and edges serve as bridges between them.
Just to clarify, the terms networks and graphs are interchangeable, so you may find that I switch between the two. They are essentially the same thing.
Python Code
Creating a New Graph
Now it’s time to get things going! Let’s begin by importing networkx into the script.
>>> import networkx as nx
To initialise an empty graph with no nodes or edges, we use the built-in Graph object. This can be achieved as follows.
>>> G = nx.Graph()
Adding nodes
As an example, let’s create a simple graph where we want to add three people known as Alice, Ben and Charlie. This is as simple as…
>>> G.add_node("Alice")
>>> G.add_node("Ben")
>>> G.add_node("Charlie")
Rather than adding them one by one, we can use a shortcut and do it all in one go with the following line…
>>> G.add_nodes_from(["Alice", "Ben", "Charlie"])
To keep track of which nodes we have in our network, we can use the following line to see what we’ve got.
>>> G.nodes()
NodeView(('Alice', 'Ben', 'Charlie'))
G.nodes() returns a view of all the nodes that are featured in the graph. This is useful to know as we can use this function to keep track of what is already included.
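If you only want a count or a quick membership check rather than the full view, something along these lines should work. This is a minimal sketch that rebuilds the same three-person graph; "Dave" is a made-up name used only to show a failed lookup.

```python
import networkx as nx

G = nx.Graph()
G.add_nodes_from(["Alice", "Ben", "Charlie"])

# Adding a node that already exists changes nothing, so the count stays at 3.
G.add_node("Alice")

node_count = G.number_of_nodes()   # how many nodes the graph holds
has_alice = "Alice" in G           # membership test on the graph itself
has_dave = G.has_node("Dave")      # explicit check; "Dave" was never added

print(node_count, has_alice, has_dave)
```

Because nodes are stored as a set-like collection, re-adding an existing name is harmless, which makes these checks handy when loading data that may contain repeats.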
Adding Edges
Now that we’ve got a graph that contains nodes, we can start to link them together. In our case, an edge represents a friendship. If Alice and Ben are to be friends then we do the following…
>>> G.add_edge("Alice", "Ben")
Much like before, there are two ways to add edges to the network. We can create a single edge (as shown above) or we can create multiple edges in one go using…
>>> G.add_edges_from([("Alice", "Ben"), ("Ben", "Charlie")])
In this example, we use the Python tuple data structure to form a list of edges. To see the newly created edges, we use the G.edges() function.
>>> G.edges()
EdgeView([('Alice', 'Ben'), ('Ben', 'Charlie')])
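Beyond listing the edges, it can be handy to ask who a given person is connected to. The sketch below rebuilds the same small friendship graph from scratch and queries it; note that adding an edge also creates any nodes that do not exist yet.

```python
import networkx as nx

G = nx.Graph()
# add_edges_from creates the nodes automatically if they are missing.
G.add_edges_from([("Alice", "Ben"), ("Ben", "Charlie")])

edge_count = G.number_of_edges()

# Everyone directly connected to Ben (sorted for a stable order).
ben_friends = sorted(G.neighbors("Ben"))

# In an undirected graph, (Ben, Alice) is the same edge as (Alice, Ben).
symmetric = G.has_edge("Ben", "Alice")

print(edge_count, ben_friends, symmetric)
```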
Attributes
So far the graphs that we have created are quite basic and are lacking a lot of detail. Suppose we want to provide more information about a certain type of node or edge involved in the graph. Using our example, we might want to attach attributes to nodes to indicate a person’s age or we can add edge attributes to show how long that friendship has existed.
There are many different examples of how you can use attributes, but the general idea is that they help to distinguish certain parts of the network. With networkx, we can easily attach attributes to edges and nodes to make things easier to understand.
Attributes are incredibly easy to set and read as they follow a key-value pair mapping. They can be initialised when creating an edge. For example, if I wanted to document how long a friendship has lasted, I would set the following attribute.
>>> G.add_edge('Alice', 'Charlie', years_friends=9)
It’s essentially a variable. Reading values again is very straightforward…
>>> print(G['Alice']['Charlie']['years_friends'])
9
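Node attributes work in much the same way as edge attributes. Here is a small sketch, assuming some invented ages and friendship lengths (the values are purely illustrative), that sets attributes on both nodes and edges and then reads and updates them.

```python
import networkx as nx

G = nx.Graph()

# Node attributes are passed as keyword arguments (ages are made up).
G.add_node("Alice", age=31)
G.add_node("Ben", age=28)
G.add_edge("Alice", "Ben", years_friends=4)

# Read a node attribute through the G.nodes view.
alice_age = G.nodes["Alice"]["age"]

# Attributes behave like dictionary entries, so they can be updated in place.
G.nodes["Ben"]["age"] = 29
G["Alice"]["Ben"]["years_friends"] += 1

# data=True returns each edge together with its attribute dictionary.
edges_with_data = list(G.edges(data=True))
print(alice_age, G.nodes["Ben"]["age"], edges_with_data)
```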
Weights
In social networks, it’s common to include a weight to document how many times an edge has been used between a pair of users. It’s essentially the sum of all interactions. This again is treated as an attribute using the ‘weight’ key.
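One way to build up weights is to bump a counter every time an interaction is seen. The sketch below assumes a small, invented list of interactions, and record_interaction is a hypothetical helper written for this example rather than part of networkx.

```python
import networkx as nx

def record_interaction(G, u, v):
    """Hypothetical helper: count one more interaction between u and v."""
    if G.has_edge(u, v):
        G[u][v]["weight"] += 1
    else:
        G.add_edge(u, v, weight=1)

G = nx.Graph()
# Invented interaction log; order within a pair does not matter here.
interactions = [("Alice", "Ben"), ("Ben", "Alice"), ("Ben", "Charlie"), ("Alice", "Ben")]
for u, v in interactions:
    record_interaction(G, u, v)

alice_ben = G["Alice"]["Ben"]["weight"]

# Weighted degree: the sum of the weights on all of Ben's edges.
ben_strength = G.degree("Ben", weight="weight")
print(alice_ben, ben_strength)
```

Using the ‘weight’ key specifically is worthwhile because many networkx functions accept a weight argument that refers to an edge attribute by name.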
Graph generators
There are many different ways to generate networks within networkx. This post guided you through the basic approach by manually adding nodes and edges. networkx includes various algorithms for generating different types of graphs. This may be useful if you want to simulate real-world data without going ahead and collecting any data.
Without going into too much detail, here are a few of the more useful generators that networkx provides. A complete list can be found in the networkx documentation.
- Complete graph : a generator which creates a fully connected graph given a specific number of nodes.
- Random : there are several variations of this approach, however the general rule is that nodes and edges are added randomly according to a specific distribution.
- Paths : a graph which contains a chain-like structure where nodes and edges are arranged in a linear pattern.
- Cycle : similar to a path, with the final node connecting to the starting node.
- Tree : nodes and edges are connected based upon a hierarchical structure. This can be used to represent inheritance or dependency.
- Lattice : nodes and edges are arranged to resemble a grid-like structure.
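To make the list above a little more concrete, here is a rough sketch that calls one generator per category. The particular functions and sizes are my own picks for illustration (for instance, erdos_renyi_graph is just one of several random models and balanced_tree is one of several tree generators).

```python
import networkx as nx

graphs = {
    "complete": nx.complete_graph(5),           # every pair of the 5 nodes is linked
    "path": nx.path_graph(5),                   # a chain of 5 nodes
    "cycle": nx.cycle_graph(5),                 # a path whose ends are joined
    "tree": nx.balanced_tree(r=2, h=3),         # binary tree of height 3
    "lattice": nx.grid_2d_graph(3, 4),          # 3 x 4 grid
    # Each possible edge is included with probability p; seed fixes the result.
    "random": nx.erdos_renyi_graph(n=20, p=0.2, seed=42),
}

# Summarise each graph as (number of nodes, number of edges).
sizes = {name: (g.number_of_nodes(), g.number_of_edges()) for name, g in graphs.items()}
for name, (n_nodes, n_edges) in sizes.items():
    print(f"{name}: {n_nodes} nodes, {n_edges} edges")
```

Each generator returns an ordinary graph object, so everything covered earlier (adding nodes, edges and attributes) still applies to the result.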
Graph types
One of the greatest things about networkx is the ability to create different types of networks. In the example featured in this blog post, we created a simple undirected graph using the nx.Graph type. There are three other types that are commonly used in network analysis.
- DiGraph : much like a regular graph, a directed graph does exactly what it says on the tin. Each edge points in a certain direction from one node to another. It acts like a one-way flow to indicate a relationship or interaction which may not be reciprocated.
- MultiGraph : graphs can become incredibly complex and edges can take on different forms within the same graph. A multigraph can contain multiple edges between the same pair of nodes. This may be useful if you are capturing interactions which take place on multiple occasions.
- MultiDiGraph : much like a MultiGraph but edges are directed. These can be repeated and added in different directions.
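The differences between these types are easiest to see side by side. The following is a small sketch using the same names as before; the "kind" attribute and its values are invented for the example.

```python
import networkx as nx

# DiGraph: an edge from Alice to Ben does not imply one from Ben to Alice.
D = nx.DiGraph()
D.add_edge("Alice", "Ben")
one_way = D.has_edge("Alice", "Ben") and not D.has_edge("Ben", "Alice")

# MultiGraph: the same pair of nodes can share several parallel edges.
M = nx.MultiGraph()
M.add_edge("Alice", "Ben", kind="email")   # "kind" is a made-up attribute
M.add_edge("Alice", "Ben", kind="call")
parallel = M.number_of_edges("Alice", "Ben")

# MultiDiGraph: parallel edges that also carry a direction.
MD = nx.MultiDiGraph()
MD.add_edge("Alice", "Ben")
MD.add_edge("Alice", "Ben")
MD.add_edge("Ben", "Alice")
forward = MD.number_of_edges("Alice", "Ben")
backward = MD.number_of_edges("Ben", "Alice")

print(one_way, parallel, forward, backward)
```

In a plain nx.Graph the second add_edge call between Alice and Ben would simply update the existing edge, which is why a MultiGraph is needed to keep repeated interactions separate.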
Conclusions
In this blog post, we walked through the simple procedure of creating and populating a graph. The approach we took is incredibly simple but for more complex graphs we may want to consider alternative formats. This will be for another blog post.