Shortest Paths in DAGs and Scheduling Problems
Introduction
This lecture focuses on shortest path problems in Directed Acyclic Graphs (DAGs) and their application to scheduling problems. We will begin by briefly reviewing Dijkstra’s algorithm and Breadth-First Search (BFS) from the previous lecture, setting the stage for understanding more efficient algorithms tailored for DAGs. The core of this lecture will cover topological sorting, a fundamental concept for DAGs, and a dynamic programming approach to find shortest paths in DAGs. Finally, we will explore how these concepts are applied to solve scheduling problems, including the Critical Path Method (CPM) for project management.
Review of Previous Lecture
In the previous lecture, we discussed algorithms for finding shortest paths in graphs.
Dijkstra’s Algorithm
Dijkstra’s algorithm is a classic algorithm used to find the shortest paths from a single source node to all other nodes in a graph, provided all edge weights are non-negative.
Time Complexity Analysis
The time complexity of Dijkstra’s algorithm depends on the implementation.
Heap-based Implementation: Using a binary heap to prioritize nodes, Dijkstra’s algorithm can achieve a time complexity of \(O(M \log N)\), where \(N\) is the number of nodes and \(M\) is the number of edges in the graph. This implementation is efficient for sparse graphs.
Array-based Implementation: In a simpler array-based implementation, where we linearly search for the node with the smallest distance at each step, the time complexity is \(O(N^2)\). This can be more efficient for dense graphs.
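As a concrete illustration, the following is a minimal sketch of the heap-based variant in Python, assuming the graph is stored as an adjacency-list dictionary adj that maps every node to a list of (neighbour, weight) pairs; the function name dijkstra and the lazy-deletion handling of stale heap entries are just one common way to realize the \(O(M \log N)\) bound.

import heapq

def dijkstra(adj, source):
    # adj: dict mapping every node to a list of (neighbour, weight) pairs;
    # all edge weights are assumed non-negative
    dist = {u: float("inf") for u in adj}
    dist[source] = 0
    heap = [(0, source)]                        # (distance, node) pairs
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                         # stale entry, distance already improved
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist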
Shortest Paths with Unit Edge Weights
When all edge weights are equal to one (unit edge weights), we can use a more specialized and efficient algorithm.
Breadth-First Search (BFS)
Breadth-First Search (BFS) is ideally suited for finding shortest paths in graphs with unit edge weights. Starting from the source node, BFS explores the graph layer by layer, guaranteeing that the first time a node is reached, it is through a shortest path in terms of the number of edges. BFS uses a queue data structure to manage the nodes to be visited, achieving a time complexity of \(O(N + M)\).
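As a sketch, BFS under the same adjacency-list assumption (a dictionary adj with every node as a key) can be written as follows; the first time a node is entered into the distance table, it has already received its shortest distance.

from collections import deque

def bfs_shortest_paths(adj, source):
    # adj: dict mapping every node to the list of its neighbours (all edges have weight 1)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:                   # first visit, hence shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist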
Shortest Paths in Directed Acyclic Graphs (DAGs)
This section focuses on finding shortest paths in Directed Acyclic Graphs (DAGs).
Significance of DAGs
DAGs are significant because they represent dependencies and precedence relationships without cycles. This property is crucial in various scenarios, such as task scheduling, dependency resolution in software projects, and project planning.
Absence of Negative Cycles
Since a DAG by definition contains no cycles, it in particular cannot contain negative cycles. This is important because negative cycles can make the shortest path problem ill-defined or computationally harder in general graphs. In DAGs, we are therefore guaranteed to find shortest paths efficiently.
Topological Sorting
Topological sorting is a linear ordering of vertices such that for every directed edge from vertex \(u\) to vertex \(v\), vertex \(u\) comes before vertex \(v\) in the ordering. Topological sorting is possible if and only if the graph is a DAG.
Definition and Properties
A topological sort of a DAG \(G = (V, E)\) is a linear ordering of its vertices such that for every directed edge \((u, v) \in E\), \(u\) comes before \(v\) in the ordering.
A key property of DAGs is the existence of at least one source node (a node with no incoming edges). Topological sorting algorithms typically start from source nodes.
Algorithm for Topological Sort
\(in\_degree \leftarrow\) Compute the in-degree of each vertex \(v \in V\)
\(Q \leftarrow\) Queue of all vertices with \(in\_degree = 0\)
\(L \leftarrow\) Empty list to store the sorted vertices
While \(Q\) is not empty:
  \(u \leftarrow\) Dequeue from \(Q\); append \(u\) to \(L\)
  For each edge \((u, v) \in E\): \(in\_degree[v] \leftarrow in\_degree[v] - 1\); if \(in\_degree[v] = 0\), enqueue \(v\) into \(Q\)
If \(L\) contains all vertices, return \(L\); otherwise report "Graph has a cycle".
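This procedure (often called Kahn's algorithm) can be sketched in Python as follows, again assuming an adjacency-list dictionary adj in which every vertex appears as a key; the function name topological_sort is only illustrative.

from collections import deque

def topological_sort(adj):
    # adj: dict mapping every vertex to the list of its direct successors
    in_degree = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            in_degree[v] += 1
    queue = deque(u for u in adj if in_degree[u] == 0)   # all source nodes
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:               # v has no unprocessed predecessors left
                queue.append(v)
    if len(order) < len(adj):
        raise ValueError("Graph has a cycle")
    return order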
Illustrative Example
Consider the following DAG:
Applying topological sort, one possible ordering is: 1, 2, 4, 3, 6, 5, 7, 8. After relabeling the nodes according to this topological order, every edge \((u, v)\) goes from a smaller index \(u\) to a larger index \(v\).
Non-Uniqueness of Topological Order
It is important to note that topological sorting is not unique. If there are multiple source nodes at any step, the order in which they are processed can lead to different valid topological sorts.
Shortest Path Algorithm for DAGs
We can efficiently find shortest paths in DAGs using dynamic programming, leveraging the topological order of the vertices.
The ‘len’ Array
We use an array, denoted as ‘len’, where ‘len[i]’ will store the length of the shortest path from a source node \(S\) to node \(i\). We assume, without loss of generality, that \(S\) is the first node in the topological sort.
Initialization of ‘len’
Initialize the ‘len’ array as follows:
\(\len[S] = 0\), where \(S\) is the source node.
\(\len[i] = \infty\) for all other nodes \(i \neq S\). (Alternatively, we can calculate values progressively without explicit initialization to infinity for nodes we haven’t reached yet in the topological order).
If we are interested in paths from a specific source \(S\) to a target \(T\), and the topological sort indices range from 1 to \(N\), we are concerned with the portion of the ‘len’ array from index \(S\) to \(T\). Nodes with indices less than \(S\) are unreachable from \(S\) in a topologically sorted DAG, and nodes with indices greater than \(T\) are not relevant if we are only interested in paths up to \(T\).
Dynamic Programming Approach
We process the nodes in topological order. For each node \(i\) from \(S+1\) to \(T\), we calculate \(\len[i]\) using the following recurrence relation:
\[\len[i] = \min_{(j, i) \in E} \{ \len[j] + c(j, i) \}\] where \(c(j, i)\) is the cost of the edge from node \(j\) to node \(i\). This formula considers all incoming edges to node \(i\). Since we are processing nodes in topological order, when we calculate \(\len[i]\), the values \(\len[j]\) for all predecessors \(j\) of \(i\) are already computed.
Time Complexity Analysis
For each node \(i\), we iterate through its incoming edges, so the total work is the sum of the in-degrees of all nodes from \(S\) to \(T\). In the worst case, if we consider all nodes in the DAG, this sum equals the number of edges \(M\). Including the time to initialize the array and iterate over the nodes, the algorithm therefore runs in \(O(N + M)\) time, which is linear in the size of the graph.
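A minimal Python sketch of this dynamic program follows, assuming the graph is given as a dictionary edges that maps a pair (u, v) to the cost \(c(u, v)\) and that order is a topological order of the vertices, for instance the output of the topological sort sketched above; in addition to the path lengths, it records a parent pointer for each node, which is used later to reconstruct the path.

def dag_shortest_path(edges, order, source):
    # edges: dict mapping a pair (u, v) to the edge cost c(u, v)
    # order: a topological order of all vertices
    INF = float("inf")
    length = {v: INF for v in order}
    parent = {v: None for v in order}
    length[source] = 0
    incoming = {v: [] for v in order}           # predecessors of each node
    for (u, v), c in edges.items():
        incoming[v].append((u, c))
    for v in order:                             # process nodes in topological order
        for u, c in incoming[v]:                # len[v] = min over (u, v) of len[u] + c(u, v)
            if length[u] + c < length[v]:
                length[v] = length[u] + c
                parent[v] = u
    return length, parent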
Detailed Example
Let’s apply the shortest path algorithm to the DAG example we saw earlier, aiming to find the shortest path from node 1 to node 8.
Step-by-Step Calculation
We initialize \(\len[1] = 0\) and consider nodes in the topological order we found earlier: 1, 2, 4, 3, 6, 5, 7, 8.
Node 1: \(\len[1] = 0\).
Node 2: Predecessor is 1. \(\len[2] = \len[1] + c(1, 2) = 0 + 3 = 3\). Parent of 2 is 1.
Node 4: Predecessor is 2. \(\len[4] = \len[2] + c(2, 4)\). The edge \((2, 4)\) is not shown in the diagram; we assume it exists with cost \(c(2, 4) = 10\) for this example. Then \(\len[4] = 3 + 10 = 13\). Parent of 4 is 2.
Node 3: Predecessor is 2. \(\len[3] = \len[2] + c(2, 3) = 3 + 4 = 7\). Parent of 3 is 2.
Node 6: Predecessors are 2 and 3.
From 2: \(\len[2] + c(2, 6) = 3 + 8 = 11\).
From 3: \(\len[3] + c(3, 6) = 7 + 5 = 12\).
\(\len[6] = \min(11, 12) = 11\). Parent of 6 is 2.
Node 5: Predecessors are 4 and 3.
From 4: \(\len[4] + c(4, 5) = 13 + 4 = 17\).
From 3: \(\len[3] + c(3, 5) = 7 + 8 = 15\).
\(\len[5] = \min(17, 15) = 15\). Parent of 5 is 3.
Node 7: Predecessors are 6, 5 and 4.
From 6: \(\len[6] + c(6, 7) = 11 + 10 = 21\).
From 5: \(\len[5] + c(5, 7) = 15 + 3 = 18\).
From 4: \(\len[4] + c(4,7) = 13 + 10 = 23\).
\(\len[7] = \min(21, 18, 23) = 18\). Parent of 7 is 5.
Node 8: Predecessors are 7 and 1.
From 7: \(\len[7] + c(7, 8) = 18 + 1 = 19\).
From 1: \(\len[1] + c(1, 8) = 0 + 20 = 20\).
\(\len[8] = \min(19, 20) = 19\). Parent of 8 is 7.
Thus, the shortest path length from node 1 to node 8 is 19.
Reconstructing the Shortest Path
To reconstruct the shortest path, we backtrack from the destination node (8) to the source node (1) using the parent pointers we recorded during the calculation.
Start at node 8. Parent is 7.
From node 7. Parent is 5.
From node 5. Parent is 3.
From node 3. Parent is 2.
From node 2. Parent is 1.
We reached the source node 1.
Reversing this sequence, the shortest path is \(1 \rightarrow 2 \rightarrow 3 \rightarrow 5 \rightarrow 7 \rightarrow 8\) with a total length of 19.
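Using the dag_shortest_path sketch from above, the whole example can be reproduced; the edge costs below are read off the worked calculation, including the assumed cost \(c(2, 4) = 10\).

edges = {
    (1, 2): 3, (1, 8): 20,
    (2, 3): 4, (2, 4): 10, (2, 6): 8,           # c(2, 4) = 10 is the assumed cost
    (3, 5): 8, (3, 6): 5,
    (4, 5): 4, (4, 7): 10,
    (5, 7): 3, (6, 7): 10,
    (7, 8): 1,
}
order = [1, 2, 4, 3, 6, 5, 7, 8]
length, parent = dag_shortest_path(edges, order, source=1)

# Backtrack from the target to the source along the parent pointers
path, node = [], 8
while node is not None:
    path.append(node)
    node = parent[node]
path.reverse()
print(length[8], path)                          # 19 [1, 2, 3, 5, 7, 8]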
Application: Scheduling Problems
Directed Acyclic Graphs have significant applications in scheduling problems.
Introduction to Scheduling
Scheduling problems involve managing jobs and resources to optimize certain objectives, often under constraints.
Jobs and Resources
In scheduling, we have a set of jobs or tasks that need to be processed using available resources, such as machines, processors, or personnel.
Precedence Constraints
Precedence constraints define the order in which jobs must be executed. For example, job A must be completed before job B can start. These constraints are naturally represented by directed edges in a graph, where an edge from A to B indicates that A must precede B.
Modeling Scheduling with DAGs
DAGs are a natural and effective way to model scheduling problems with precedence constraints.
Nodes Representing Jobs/Tasks
In a DAG scheduling model, each node in the graph represents a job or task that needs to be performed.
Edges Representing Precedence Relations
A directed edge from node \(i\) to node \(j\) signifies that job \(i\) must be completed before job \(j\) can start.
Job Processing Times
Each job \(j\) is associated with a processing time \(P_j\), which is the duration required to complete the job.
Scheduling Constraints
Besides precedence constraints, scheduling problems may involve other types of constraints.
Precedence Constraints
As discussed, these are the core constraints in many scheduling problems, dictating the order of job execution.
Release Dates
A release date \(r_j\) for job \(j\) specifies the earliest time at which job \(j\) can start its processing. Job \(j\) cannot start before time \(r_j\).
Deadlines
A deadline \(d_j\) for job \(j\) specifies the latest time by which job \(j\) must be completed. Job \(j\) must finish by time \(d_j\).
Critical Path Method (CPM)
The Critical Path Method (CPM) is a project management technique used to schedule, organize, and coordinate tasks within a project. It is particularly useful for projects with well-defined tasks and precedence constraints.
Adding Fictitious Start and End Nodes
In CPM, it is often helpful to introduce fictitious start and end nodes. A start node (node 0) represents the beginning of the project, and an end node (node \(n+1\)) represents the completion of the project. These nodes have zero processing time. Edges are added from the start node to all jobs that have no predecessors, and from all jobs that have no successors to the end node.
Earliest Starting Time (ET)
The Earliest Starting Time (\(\ET[i]\)) for a job \(i\) is the earliest possible time at which job \(i\) can begin, considering all precedence constraints.
Calculation as a Longest Path Problem
In CPM, to find the earliest starting time, we treat the problem as a longest path problem in the DAG. The \(\ET[i]\) for each job \(i\) is the length of the longest path from the start node (0) to node \(i\). We can use a similar dynamic programming approach as for shortest paths, but instead of minimizing, we maximize:
\[\ET[i] = \max_{(j, i) \in E} \{ \ET[j] + P_j \}\] with \(\ET[0] = 0\). The edge weights in this longest path calculation are the processing times of the predecessor jobs.
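A minimal sketch of this forward pass, assuming the precedence graph is given as a list of pairs edges, a dictionary proc_time with the processing times \(P_j\) (zero for the two fictitious nodes), and a topological order order that starts at the fictitious start node and ends at the fictitious end node:

def earliest_start_times(edges, proc_time, order):
    # edges: precedence pairs (i, j), meaning job i must finish before job j starts
    # proc_time: processing time P_i of each job (0 for the fictitious nodes)
    # order: topological order from the start node to the end node
    pred = {i: [] for i in order}
    for i, j in edges:
        pred[j].append(i)
    ET = {i: 0 for i in order}
    for i in order:                             # forward pass (longest paths from the start node)
        for j in pred[i]:
            ET[i] = max(ET[i], ET[j] + proc_time[j])
    makespan = ET[order[-1]]                    # ET of the fictitious end node
    return ET, makespan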
Makespan Definition
The Makespan (\(\makespan\)) is the total time required to complete all jobs in the project. It is the completion time of the last job to finish. In CPM, the makespan is given by the earliest starting time of the fictitious end node, \(\makespan = \ET[n+1]\).
Latest Starting Time (LT)
The Latest Starting Time (\(\LT[i]\)) for a job \(i\) is the latest possible time at which job \(i\) can start without delaying the project completion time (makespan).
Calculation of LT
To calculate \(\LT[i]\), we work backward from the end node (\(n+1\)). We set \(\LT[n+1] = \makespan\). Then, for each job \(i\), we calculate \(\LT[i]\) as:
\[\LT[i] = \min_{(i, j) \in E} \{ \LT[j] - P_i \}\] This calculation ensures that if job \(i\) starts at time \(\LT[i]\), all subsequent jobs can still be completed without exceeding the makespan.
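The backward pass can be sketched analogously, reusing the same representation together with the makespan obtained from the forward pass.

def latest_start_times(edges, proc_time, order, makespan):
    # Same representation as in earliest_start_times above
    succ = {i: [] for i in order}
    for i, j in edges:
        succ[i].append(j)
    LT = {i: makespan for i in order}           # the fictitious end node keeps LT = makespan
    for i in reversed(order):                   # backward pass
        for j in succ[i]:
            LT[i] = min(LT[i], LT[j] - proc_time[i])
    return LT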
Slack Time
The Slack Time (or float) for a job \(i\) is the amount of time that job \(i\) can be delayed without delaying the project makespan. It is calculated as:
\[\text{Slack}[i] = \LT[i] - \ET[i]\]
Identifying Critical Jobs and Paths
A critical job is a job with zero slack time (\(\text{Slack}[i] = 0\)). These jobs must start on time to avoid delaying the project. A critical path is a path from the start node to the end node consisting only of critical jobs. The critical path determines the minimum project completion time (makespan).
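Assuming edges, proc_time, and order as in the sketches above, slack times and critical jobs then follow directly.

ET, makespan = earliest_start_times(edges, proc_time, order)
LT = latest_start_times(edges, proc_time, order, makespan)
slack = {i: LT[i] - ET[i] for i in order}
critical_jobs = [i for i in order if slack[i] == 0]   # jobs that must start exactly at ET[i]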
Illustrative Example: Document Preparation
Consider the example of preparing a document, such as a thesis.
Task Decomposition
Tasks involved in document preparation might include:
Write Section A
Write Section B
Print Section A
Print Section B
Bind A + B
Buy Envelope
Write Address
Package
Deliver to Secretary
Mail Package
Precedence Graph Construction
We can construct a precedence graph based on dependencies. For example, "Print Section A" must come after "Write Section A". "Bind A + B" must come after both "Print Section A" and "Print Section B". We also need an edge from "Bind A + B" to "Package".
Application of CPM
By assigning processing times to each task and applying CPM, we can determine:
The earliest start time for each task.
The latest start time for each task without delaying the completion.
The critical path, which identifies the sequence of tasks that directly impacts the project completion time.
The makespan, which is the minimum time to complete the document preparation.
This analysis helps in identifying critical tasks that need close monitoring and resource allocation to ensure timely project completion.
Conclusion
In this lecture, we explored efficient algorithms for finding shortest paths in Directed Acyclic Graphs (DAGs) using topological sorting and dynamic programming. We saw that DAGs, due to their acyclic nature, simplify the shortest path problem and allow for linear-time solutions. We then applied these concepts to scheduling problems, introducing the Critical Path Method (CPM). CPM helps in project management by identifying critical tasks and paths, enabling efficient resource allocation and project planning.
Key takeaways from this lecture include:
DAGs are a special class of graphs that simplify shortest path calculations.
Topological sorting is a fundamental algorithm for DAGs, enabling efficient processing.
Dynamic programming provides a linear-time algorithm for shortest paths in DAGs.
CPM is a powerful application of DAGs and longest path algorithms for project scheduling and critical task identification.
Further questions to consider:
How can we handle more complex scheduling constraints, such as resource limitations or job priorities?
How do these methods extend to project management scenarios with uncertainties in task durations?
What are other applications of shortest and longest path algorithms in DAGs beyond scheduling?
This concludes our lecture on shortest paths in DAGs and scheduling problems.