If the setup time of a job is independent of the job scheduled before it, then we can simply include the setup time in the processing time and find optimal schedules as if there were no setup times.

In situations with sequence-dependent setup times, a common requirement is to schedule a set of jobs so as to minimize the total setup time. If a similar set of jobs becomes available after the last job is processed, the required schedule is cyclic and the problem is equivalent to the Traveling Salesman Problem (TSP). Each job corresponds to a city that a salesman must visit, each setup time corresponds to the distance between two cities, and an optimal schedule is an ordering of the cities in which the salesman visits each exactly once before returning to his arbitrarily chosen starting city.

In general, the TSP is difficult to solve, and often a suboptimal solution must be used. The following greedy algorithm, which finds a reasonable but not necessarily optimal solution, is known as the nearest city method, as it constructs schedules in which the salesman next visits the nearest city he has not yet visited.

Some arbitrary city is scheduled first. Until all cities are chosen, the unscheduled city closest to the city last scheduled is scheduled next.

Whenever there is a tie for the next city, as there is in choosing the first, all possibilities must be considered, and so many schedules will be generated. From all of these a schedule with the smallest total setup time is chosen as the solution, which may or may not be optimal.
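The nearest city method might be sketched as follows; `nearest_city_schedule` and `best_nearest_city` are illustrative names, and for brevity this sketch breaks ties arbitrarily rather than exploring every tied alternative as described above:

```python
def nearest_city_schedule(setup, start=0):
    """Greedy nearest-city schedule from one starting city.

    setup[i][j] is the setup time when job j immediately follows job i.
    Returns (order, total) for a cyclic schedule; ties are broken
    arbitrarily here, not by enumerating all tied choices.
    """
    n = len(setup)
    order = [start]
    unvisited = set(range(n)) - {start}
    total = 0
    while unvisited:
        last = order[-1]
        nxt = min(unvisited, key=lambda j: setup[last][j])  # nearest unscheduled city
        total += setup[last][nxt]
        order.append(nxt)
        unvisited.remove(nxt)
    total += setup[order[-1]][start]  # return to the starting city (cyclic schedule)
    return order, total

def best_nearest_city(setup):
    """Run the heuristic from every starting city and keep the cheapest."""
    return min((nearest_city_schedule(setup, s) for s in range(len(setup))),
               key=lambda t: t[1])
```

Trying every starting city approximates considering the tie in choosing the first city, though a full implementation would also branch on later ties.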

Let `A` be a dominant set of solutions to some problem; in particular, it could be the set of all possible solutions.

Suppose we have some easy method of partitioning `A` into two (or more) subsets of solutions, say `A_1` and `A_2`. Then an optimal solution over `A` is the better of an optimal solution over `A_1` and an optimal solution over `A_2`, so the problem may be solved by solving these two smaller subproblems.

Now suppose that for any problem we have an easy method of *bounding* the value of the best possible solution. That is, if we are minimizing a function and we know that over all possible solutions none could have a value less than, say, `X`, then `X` is a *lower bound* on the set of solutions to the problem, although the optimal value may actually be much larger than `X`.

The subproblems generated cannot be solved simultaneously,
and so some decision must be made as to which to solve next.
One important consideration is that whenever a solution better than the incumbent is found, all unsolved subproblems whose lower bounds are not less than its value may immediately be discarded.
Thus it might be a good idea to always choose the subproblem with the smallest
lower bound, both because it looks most promising in terms of finding an optimal solution,
and because by postponing the solving of subproblems with large bounds we may be able to avoid them altogether.
Such a strategy is known as *jumptracking*, because in the tree representing the subproblems the next subproblem chosen may come from anywhere in the tree.
Now while this may lead to an optimal solution fairly quickly,
it has the disadvantage that the number of subproblems available for solving
may become quite large, and in fact may approach n! in the worst case.
If instead we use a strategy known as *backtracking* or *depth-first search*, then the number of unsolved subproblems is never much greater than n. This method completely solves each subproblem before moving on to the next. Thus if `A` were partitioned into `A_1` and `A_2`, then `A_1` would be completely solved before any work on `A_2` began.
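The two strategies differ only in the data structure holding the unsolved subproblems, which a minimal sketch makes concrete (the subproblem names and bounds below are invented for illustration):

```python
import heapq

# Jumptracking: the pool of unsolved subproblems is a priority queue
# keyed on lower bound, so the next subproblem chosen may come from
# anywhere in the tree.
pool = [(14, "A1"), (9, "A2"), (11, "A21")]
heapq.heapify(pool)
bound, sub = heapq.heappop(pool)   # expands "A2", the smallest bound

# Backtracking / depth-first search: the pool is a stack, so the most
# recently generated subproblem is always expanded next, and the stack
# never holds much more than one root-to-leaf path.
stack = ["A1", "A2", "A21"]
top = stack.pop()                  # expands "A21", the newest subproblem
```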

The following algorithm is known as branch and bound as it uses these two concepts to find an optimal solution. We will refer to the incumbent solution as the best known solution to the problem. If we don't have one, its value may initially be set arbitrarily large.

If the problem's lower bound is not less than the value of the incumbent, then we need not solve the problem as we cannot possibly find a better solution than the one we already have. Otherwise if the problem can be easily solved, we do so and then replace the incumbent if a better value is found. Otherwise if we cannot solve the problem easily, we partition it into smaller subproblems and solve each of them using the branch and bound algorithm.
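The algorithm just described can be sketched as a depth-first (backtracking) recursion; the helper names `lower_bound`, `try_solve`, and `partition` are placeholders for the problem-specific routines, not part of any standard API:

```python
def branch_and_bound(problem, incumbent, lower_bound, try_solve, partition):
    """Depth-first branch and bound sketch (minimization).

    incumbent       (value, solution), with value float('inf') if none known
    lower_bound(p)  cheap lower bound on the best value obtainable in p
    try_solve(p)    (value, solution) if p is easy to solve, else None
    partition(p)    list of subproblems that together cover p
    """
    if lower_bound(problem) >= incumbent[0]:
        return incumbent                      # cannot beat the incumbent: prune
    solved = try_solve(problem)
    if solved is not None:
        return min(incumbent, solved, key=lambda t: t[0])  # keep the better value
    for sub in partition(problem):            # otherwise branch and recurse
        incumbent = branch_and_bound(sub, incumbent, lower_bound, try_solve, partition)
    return incumbent
```

As a toy usage example, minimizing over a list by repeatedly splitting it in half exercises every branch of the sketch, with the list minimum serving as the (here exact) lower bound.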

For the Traveling Salesman Problem, the following branch and bound solution, due to Little, Murty, Sweeney, and Karel [Little, J.D.C.; Murty, K.G.; Sweeney, D.W.; Karel, C. "An Algorithm for the Traveling Salesman Problem." Operations Research 11(6), 1963], may be used to find an optimal solution.
An incumbent may be obtained by the nearest city heuristic.

The total setup time of any solution consists of exactly one entry from each row and each column of the setup-time matrix. Thus adding a constant to every entry in a row or column changes the value of every solution by that amount. If the smallest value in each row is subtracted from that row, and the smallest remaining value in each column is subtracted from that column, then the resulting matrix has at least one zero in every row and column. The total amount subtracted is a lower bound on the original problem, since the value of any solution in the reduced matrix differs from the true total setup time by exactly this amount.
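A reduction of this kind might be sketched as follows, assuming the diagonal entries (a city following itself) have already been set to infinity; `reduce_matrix` is an illustrative name:

```python
INF = float('inf')

def reduce_matrix(m):
    """Row- then column-reduce a setup-time matrix in place.

    Subtracts the smallest entry from every row, then the smallest
    remaining entry from every column; returns the total subtracted,
    a lower bound on the value of any tour.  Infinite entries (such
    as the diagonal) are left untouched.
    """
    n = len(m)
    bound = 0
    for i in range(n):                          # reduce each row
        low = min(m[i])
        if 0 < low < INF:
            for j in range(n):
                if m[i][j] < INF:
                    m[i][j] -= low
            bound += low
    for j in range(n):                          # then each column
        low = min(m[i][j] for i in range(n))
        if 0 < low < INF:
            for i in range(n):
                if m[i][j] < INF:
                    m[i][j] -= low
            bound += low
    return bound
```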

Branching consists of choosing some ordered pair of cities that either
will or will not be in the schedule.
If the pair (`i,j`) is chosen, then a value of infinity may be inserted into position [`i,j`]
in the matrix and the resulting problem in which city `j`
never immediately follows city `i`, may be solved by branch and bound.
The corresponding problem in which city `j` always follows `i` is generated
by setting entry [`i,j`] to 0 and the rest of row `i` and column `j` to infinity.
The entry in row `j` corresponding to the first city that
*must*
have been visited before `j` is set to infinity to prevent subtours.
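The two subproblems generated by branching on a pair (`i,j`) might be sketched as below; the function names are illustrative, and for brevity `include_pair` forbids only the immediate subtour `j -> i`, whereas the full algorithm forbids the arc closing whatever partial path contains `i` and `j`:

```python
INF = float('inf')

def exclude_pair(m, i, j):
    """Subproblem in which city j never immediately follows city i."""
    out = [row[:] for row in m]
    out[i][j] = INF
    return out

def include_pair(m, i, j):
    """Subproblem in which city j always immediately follows city i."""
    out = [row[:] for row in m]
    for k in range(len(m)):
        out[i][k] = INF     # i's successor is now fixed ...
        out[k][j] = INF     # ... and so is j's predecessor
    out[i][j] = 0           # taking the chosen arc costs nothing further
    out[j][i] = INF         # forbid the subtour i -> j -> i (simplest case only)
    return out
```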

The pair to branch on is determined by finding, with one level of look ahead,
which choice would be the worst not to take.
If entry [`i,j`] were not chosen, then as described above entry [`i,j`] would be set to infinity,
and the resulting matrix reduced by the smallest element in row `i` and column `j`
to find the new lower bound.
Thus we consider each zero entry in the matrix and associate with it a value equal to the sum of the smallest remaining entries in its row and its column.
The branch is determined by choosing the entry with the largest such value,
as it carries the worst penalty for not being chosen.
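This penalty calculation might be sketched as follows, assuming the matrix has already been reduced so that every row and column contains a zero; `branching_pair` is an illustrative name:

```python
INF = float('inf')

def branching_pair(m):
    """Pick the zero entry with the largest penalty for not taking it.

    The penalty of a zero at [i, j] is the smallest other entry in row i
    plus the smallest other entry in column j: the amount the lower
    bound would rise if the pair (i, j) were excluded.
    """
    n = len(m)
    best, best_pen = None, -1
    for i in range(n):
        for j in range(n):
            if m[i][j] == 0:
                row_min = min(m[i][k] for k in range(n) if k != j)
                col_min = min(m[k][j] for k in range(n) if k != i)
                pen = row_min + col_min
                if pen > best_pen:
                    best, best_pen = (i, j), pen
    return best, best_pen
```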