Complete analysis of BFS/DFS traversal, Dijkstra/Floyd shortest path, Prim/Kruskal minimum spanning tree, and topological sorting
Graph is one of the most versatile and powerful data structures in computer science. Human relationships in social networks, web page links on the Internet, cities and highways on maps, function calling relationships in compilers—these real-world problems can be naturally abstracted into graph models. Mastering graph algorithms is a key step from "Introduction to Programming" to "Advanced Algorithms".
This article will systematically explain the basic concepts of graphs, two core traversal methods (BFS and DFS), three shortest path algorithms (Dijkstra, Floyd-Warshall, Bellman-Ford), two minimum spanning tree algorithms (Prim, Kruskal), and topological sorting. All code is written in Python and fully commented.
Basic concepts of graphs
Directed and undirected graphs
- Undirected Graph: The edges have no direction and represent a two-way relationship. Such as "friend relationship" in social networks.
Weighted and unweighted graphs
- Unweighted graph: All edges have the same weight (usually considered 1).
How to store pictures
Adjacency Matrix: Use a two-dimensional array graph[i][j] to store edge information from node i to node j.
# 邻接矩阵表示(无向带权图)INF = float('inf')graph_matrix = [ [0, 4, INF, INF, INF], [4, 0, 8, INF, INF], [INF, 8, 0, 7, INF], [INF, INF, 7, 0, 9 ], [INF, INF, INF, 9, 0 ]]- 优点:查询任意两点是否相邻为 O(1)
- 缺点:空间 O(V²),对稀疏图浪费严重
邻接表(Adjacency List):每个节点维护一个列表,存储其所有邻居及边权。
# 邻接表表示(无向带权图)graph_list = { 0: [(1, 4)], 1: [(0, 4), (2, 8)], 2: [(1, 8), (3, 7)], 3: [(2, 7), (4, 9)], 4: [(3, 9)]}- Advantages: Space O(V + E), efficient neighbor traversal
Selection principle: Sparse graphs (E is much smaller than V²) use adjacency lists, and dense graphs (E is close to V²) use adjacency matrices. In actual projects, adjacency lists are the more common choice.
Graph traversal
Graph traversal is the basis of almost all graph algorithms and is divided into two classic methods: breadth-first search (BFS) and depth-first search (DFS).
BFS — Breadth First Search
Principle: BFS starts from the starting node and expands outward layer by layer. It visits all nodes with distance 1 first, then nodes with distance 2, and so on. BFS uses Queue to maintain the order of nodes to be visited.
Core Features: In an unweighted graph, the path found by BFS must be the shortest path.
from collections import deque def bfs(graph, start): """ 广度优先搜索 :param graph: 邻接表,graph[u] = [(v1, w1), (v2, w2), ...] 或 [v1, v2, ...] :param start: 起始节点 :return: BFS 遍历顺序 """ visited = set() queue = deque([start]) visited.add(start) order = [] while queue: node = queue.popleft() order.append(node) for neighbor in graph[node]: # 兼容带权和无权邻接表 v = neighbor[0] if isinstance(neighbor, tuple) else neighbor if v not in visited: visited.add(v) queue.append(v) return order # BFS 求无权图最短路径def bfs_shortest_path(graph, start, end): """ 在无权图中求从 start 到 end 的最短路径 """ visited = {start} queue = deque([(start, [start])]) while queue: node, path = queue.popleft() if node == end: return path for neighbor in graph[node]: v = neighbor[0] if isinstance(neighbor, tuple) else neighbor if v not in visited: visited.add(v) queue.append((v, path + [v])) return None # 不可达应用场景:
- 无权图的最短路径
- 层序遍历(二叉树的层次遍历本质就是 BFS)
- 连通分量检测
- 网络爬虫的广度抓取策略
DFS — 深度优先搜索
原理: DFS 从起始节点出发,沿着一条路径尽可能深入,直到无法继续时回溯到上一个分叉点。DFS 可以使用递归或显式栈实现。
def dfs_recursive(graph, start, visited=None): """ 深度优先搜索(递归实现) """ if visited is None: visited = set() visited.add(start) order = [start] for neighbor in graph[start]: return order def dfs_iterative(graph, start): """ Depth first search (stack implementation) """ visited = set() stack = [start] order = [] while stack: # Push the stack in reverse order to ensure that the access sequence is consistent with recursion return order **Application scenario:**- Connectivity judgment (whether the graph is connected)- Ring detection- Topological sorting- Path search and backtracking algorithm- Strongly connected components (Tarjan algorithm, Kosaraju algorithm) ### BFS vs DFS comparison | Features | BFS | DFS ||-----|-----|-----|| Data Structure | Queue | Stack/Recursion || Space complexity | O(V) (worst case storage of an entire layer) | O(V) (worst case recursion depth is V) || Shortest path | Guaranteed (unweighted graph) | Not guaranteed || Suitable for scenarios | Hierarchical traversal, shortest path | Connectivity, ring detection, topological sorting | ## shortest path algorithm ### Dijkstra's algorithm **Principle:** Dijkstra's algorithm is a classic greedy algorithm that solves the **single source shortest path** problem. It maintains a set of nodes with the shortest distance determined, selects the node with the smallest distance from the undetermined set each time, and uses this node to update the distance estimate of its neighbors. **Applicable conditions:** Edge weights must be **non-negative**. **Time complexity of optimized version of priority queue:** O((V + E) log V) ```pythonimport heapq def dijkstra(graph, start): """ Dijkstra 单源最短路径算法 :param graph: 邻接表,graph[u] = [(v, weight), ...] :param start: 起始节点 :return: dist(最短距离字典), prev(前驱节点字典,用于路径还原) """ dist = {node: float('inf') for node in graph} dist[start] = 0 prev = {node: None for node in graph} # 优先队列:(距离, 节点) pq = [(0, start)] while pq: d, u = heapq.heappop(pq) # 如果取出的距离已过时,跳过 if d > dist[u]: continue for v, weight in graph[u]: new_dist = dist[u] + weight if new_dist < dist[v]: dist[v] = new_dist prev[v] = u heapq.heappush(pq, (new_dist, v)) return dist, prev def reconstruct_path(prev, start, end): """根据前驱节点字典还原路径""" path = [] current = end while current is not None: path.append(current) current = prev[current] path.reverse() return path if path[0] == start else [] # 使用示例graph = { 'A': [('B', 4), ('C', 2)], 'B': [('A', 4), ('C', 1), ('D', 5)], 'C': [('A', 2), ('B', 1), ('D', 8), ('E', 10)], 'D': [('B', 5), ('C', 8), ('E', 2)], 'E': [('C', 10), ('D', 2)]} dist, prev = dijkstra(graph, 'A')print(f"A 到各点最短距离: {dist}")print(f"A 到 E 的最短路径: {reconstruct_path(prev, 'A', 'E')}")Floyd-Warshall 算法
原理: Floyd-Warshall 算法求解全源最短路径,即所有节点对之间的最短距离。核心思想是动态规划:依次考虑每个节点 k 作为中转点,看是否能缩短 i 到 j 的距离。
时间复杂度: O(V³)
适用场景: 图规模较小(V ≤ 几百)、需要全部节点对的最短路径、可处理负权边(但不能有负权环)。
def floyd_warshall(graph_matrix): """ Floyd-Warshall 全源最短路径 :param graph_matrix: V×V 的邻接矩阵,graph_matrix[i][j] 为边权,无边为 INF :return: dist 矩阵 """ V = len(graph_matrix) # 复制矩阵避免修改原数据 dist = [row[:] for row in graph_matrix] # Core triple loop: k is the transfer point return dist # Usage example ### Bellman-Ford algorithm **Principle:** The Bellman-Ford algorithm is also a single-source shortest path algorithm, but it can handle **negative-weighted edges**. The algorithm performs V-1 rounds of relaxation operations on all edges, and each round checks whether each edge can shorten the distance. If round V can still relax, it means there is a negative weight cycle in the graph. **Time complexity:** O(VE) ```pythondef bellman_ford(vertices, edges, start): """ Bellman-Ford 单源最短路径(可处理负权边) :param vertices: 节点列表 :param edges: 边列表,[(u, v, weight), ...] :param start: 起始节点 :return: dist 字典,若存在负权环返回 None """ dist = {v: float('inf') for v in vertices} dist[start] = 0 # V-1 轮松弛 for _ in range(len(vertices) - 1): for u, v, w in edges: if dist[u] != float('inf') and dist[u] + w < dist[v]: dist[v] = dist[u] + w # 第 V 轮检测负权环 for u, v, w in edges: if dist[u] != float('inf') and dist[u] + w < dist[v]: print("图中存在负权环!") return None return dist # 使用示例vertices = ['A', 'B', 'C', 'D', 'E']edges = [ ('A', 'B', -1), ('A', 'C', 4), ('B', 'C', 3), ('B', 'D', 2), ('B', 'E', 2), ('D', 'B', 1), ('D', 'C', 5), ('E', 'D', -3)]print(bellman_ford(vertices, edges, 'A'))最短路径算法对比
最小生成树
最小生成树(Minimum Spanning Tree,MST)是一棵包含图中所有顶点的无环连通子图,且所有边的权重之和最小。MST 在网络设计(如最低成本布线)中有重要应用。
Prim 算法
原理: Prim 算法从任意一个顶点出发,每次选择连接已选顶点集合和未选顶点集合之间权重最小的边,将对应的未选顶点加入集合。这是一种贪心策略。
时间复杂度: O((V + E) log V)(使用优先队列)
import heapq def prim(graph, start): """ Prim minimum spanning tree algorithm :param graph: adjacency list, graph[u] = [(v, weight), ...] :param start: starting node :return: MST’s edge list and total weight """ mst_edges = [] total_weight = 0 visited = set() # Priority queue: (weight, current node, source node) pq = [(0, start, None)] while pq and len(visited) < len(graph): if u in visited: visited.add(u) for v, w in graph[u]: return mst_edges, total_weight ### Kruskal algorithm **Principle:** Kruskal's algorithm sorts all edges by weight from small to large, and considers each edge in turn - if adding the edge will not form a cycle (judged by union search), add it to MST. **Time complexity:** O(E log E) (sorting is the main overhead) ```pythonclass UnionFind: """并查集(路径压缩 + 按秩合并)""" def __init__(self, n): self.parent = list(range(n)) self.rank = [0] * n def find(self, x): if self.parent[x] != x: self.parent[x] = self.find(self.parent[x]) # 路径压缩 return self.parent[x] def union(self, x, y): px, py = self.find(x), self.find(y) if px == py: return False # 已在同一集合,加入会形成环 # 按秩合并 if self.rank[px] < self.rank[py]: px, py = py, px self.parent[py] = px if self.rank[px] == self.rank[py]: self.rank[px] += 1 return True def kruskal(vertices, edges): """ Kruskal 最小生成树算法 :param vertices: 节点列表 :param edges: 边列表 [(u, v, weight), ...] :return: MST 的边列表和总权重 """ # 将节点映射为整数索引 v_map = {v: i for i, v in enumerate(vertices)} uf = UnionFind(len(vertices)) # 按权重排序 sorted_edges = sorted(edges, key=lambda e: e[2]) mst_edges = [] total_weight = 0 for u, v, w in sorted_edges: if uf.union(v_map[u], v_map[v]): mst_edges.append((u, v, w)) total_weight += w if len(mst_edges) == len(vertices) - 1: break # MST 已有 V-1 条边 return mst_edges, total_weight # 使用示例vertices = ['A', 'B', 'C', 'D', 'E']edges = [ ('A', 'B', 4), ('A', 'C', 2), ('B', 'C', 1), ('B', 'D', 5), ('C', 'D', 8), ('C', 'E', 10), ('D', 'E', 2)]mst, weight = kruskal(vertices, edges)print(f"最小生成树: {mst}")print(f"总权重: {weight}")Prim vs Kruskal 对比
拓扑排序
定义: 拓扑排序是对**有向无环图(DAG)**中所有顶点的一种线性排序,使得对于每条有向边 (u, v),u 在排序中出现在 v 之前。
应用场景: 课程先修关系、任务调度、编译依赖分析、包管理器的安装顺序。
Kahn 算法(BFS 方式)
原理: 维护所有入度为 0 的节点的队列。每次取出一个入度为 0 的节点,将其加入结果,并将其所有邻居的入度减 1。如果某个邻居的入度变为 0,加入队列。
from collections import deque def topological_sort_kahn(graph, vertices): """ Kahn algorithm implements topological sorting (BFS) :param graph: directed graph adjacency list, graph[u] = [v1, v2, ...] :param vertices: list of all vertices :return: Topological sorting result, if there is a cycle, return None """ # Calculate the in-degree of all nodes in_degree = {v: 0 for v in vertices} for u in graph: for v in graph[u]: in_degree[v] = in_degree.get(v, 0) + 1 #Enqueue all nodes with indegree 0 while queue: for neighbor in graph.get(node, []): # If the result length is not equal to the number of nodes, it means there is a cycle return result # Usage example: Course elective sequence **Time complexity:** O(V + E) ## Summary of time complexity of each algorithm | Algorithm | Time complexity | Space complexity | Purpose ||------|-----------|-----------|------|| BFS | O(V + E) | O(V) | Traversal, unweighted shortest path || DFS | O(V + E) | O(V) | Traversal, connectivity, ring detection || Dijkstra | O((V+E) log V) | O(V) | Single source shortest path (non-negative weight) || Floyd-Warshall | O(V³) | O(V²) | All-source shortest path || Bellman-Ford | O(VE) | O(V) | Single-source shortest path (negative weight) || Prim | O((V+E) log V) | O(V) | Minimum spanning tree (dense graph) || Kruskal | O(E log E) | O(V) | Minimum spanning tree (sparse graph) || Kahn topological sort | O(V + E) | O(V) | DAG linear sort | ## Summary: Algorithm selection in different scenarios The choice of graph algorithm depends on the specific needs of the problem: **Find the shortest path:**- Non-negative weight graph, single source → Dijkstra- Contains negative weight edges, single source → Bellman-Ford- Between all node pairs, small scale → Floyd-Warshall- Unweighted graph → Direct BFS **Find the minimum spanning tree:**- Dense graph → Prim- Sparse graph → Kruskal **Find the topological order:**- DAG → Kahn algorithm (BFS) or DFS postorder reversal **Detection ring:**- Undirected graph → DFS or union search- Directed graph → DFS (detection of back edges) or Kahn algorithm (failure of topological sort indicates cycle) **Check connectivity:**- Undirected graph → BFS/DFS or union search- Strongly connected components of directed graph → Tarjan algorithm or Kosaraju algorithm Graph algorithms are one of the richest and most widely used fields in algorithm learning. A solid grasp of these basic algorithms introduced in this article will lay a solid foundation for you to understand more advanced graph theory problems (such as network flow, binary matching, minimum cost flow, etc.). It is recommended to draw more pictures and manually simulate the algorithm execution process during the learning process, which is more conducive to in-depth understanding than simply looking at the code.