AWS EC2 Topology Graph 📊

Posted on Jan 8, 2024
tl;dr: See network layout of your EC2 Instances

Instance Topology Graph

Click for Demo!

What is the EC2 Topology API?

AWS Instance Topology gives you details on the interconnect between ec2 instances. It’s helpful to determine if all the nodes are co-located on the same spine and can even be used when trying to make scaling decision for large ML training runs or HPC jobs.

To call the API, you’ll need at least aws cli version 2.13.36. You can check the version with:

$ aws --version
aws-cli/2.15.8 Python/3.11.6 Darwin/22.6.0 exe/x86_64 prompt/off

Then you can run:

aws ec2 describe-instance-topology

This will return a JSON output similar to the following:

{
    "Instances": [
        {
            "InstanceId": "i-1111111111example",
            "InstanceType": "p4d.24xlarge",
            "GroupName": "my-ml-cpg",
            "NetworkNodes": [
                "nn-1111111111example",
                "nn-2222222222example",
                "nn-3333333333example"
            ],
            "ZoneId": "usw2-az2",
            "AvailabilityZone": "us-west-2a"
        },
        {
            "InstanceId": "i-2222222222example",
            "InstanceType": "p4d.24xlarge",
            "NetworkNodes": [
                "nn-1111111111example",
                "nn-2222222222example",
                "nn-3333333333example"
            ],
            "ZoneId": "usw2-az2",
            "AvailabilityZone": "us-west-2a"
        },
...

This is great but hard to visualize, hence why we’re adding D3 Graph to this output.

How do we interpret this data?

Instance Topology Graph

Each instance has a set of three network nodes, these are ordered from furthest to closest to the instance, so Layer 3 is closest to the instance. The best possible latency comes from two instances with the same Layer 3 switch.

Layer Network Node
Layer 3 (closest to instance) nn-3333333333example
Layer 2 nn-2222222222example
Layer 1 nn-1111111111example

A cluster entirely connected looks like:

Instance Topology Graph

A mostly disconnected cluster looks like:

Instance Topology Graph

A highly interconnected cluster can be accomplished by placing the instances into a placement group, See EFA Best Practices for more.

What is D3 Graph?

D3 Graph is a python library that lets us create our graph in python and then converts the results into HTML + CSS.

  1. Install dependencies (suggest doing in a virtualenv):
pip3 install d3graph
pip3 install scikit-learn
pip3 install seaborn
pip3 install boto3
  1. Create a python script instance-topology-graph.py
# Import library
from d3graph import d3graph, vec2adjmat
import boto3
from pprint import pprint

ec2 = boto3.client('ec2', region_name='us-west-2')

response = ec2.describe_instance_topology(Filters = [{'Name':'instance-type', 'Values':['p4de.24xlarge']}])

pprint(response.get('Instances'))

# Create example network
source = []
target = []
for instance in response.get('Instances'):
    # Layer 3 (closest to instance)
    source += [instance.get('InstanceId')]
    target += [instance.get('NetworkNodes')[2]]
    # Layer 2
    source += [instance.get('NetworkNodes')[2]]
    target += [instance.get('NetworkNodes')[1]]
    # Layer 1
    source += [instance.get('NetworkNodes')[1]]
    target += [instance.get('NetworkNodes')[0]]

pprint(source)
pprint(target)

# Convert to adjacency matrix
adjmat = vec2adjmat(source, target)

# # Initialize
d3 = d3graph()
# Proces adjmat
d3.graph(adjmat)
# Plot
d3.show()

# Make changes in node properties
d3.set_node_properties(color=adjmat.columns.values)
# Plot
d3.show(filepath='temp/')
  1. Next run it, changing region and instance type for your use case:
python3 d3.py

Voila! a file d3graph.html will get created in a temp/ directory. You can now open that file to see your instance topology.

What does Good look like?

comments powered by Disqus