COMS 482: unofficial class blog


Lecture 32: Center Selection

Posted in Class Notes by Elliott Back on April 15th, 2005. [Del.icio.us]

Center Selection (CS): Given n towns, place k malls, minimizing the worst driving distance. The CS decision problem can be written: “Given n points, a radius r, and an integer k, can we place k disks of radius r so that all points are covered?”

dist(*,*) must satisfy:

  • dist(s, s) = 0 (reflexive)
  • dist(s, t) = dist(t, s) (symmetry)
  • dist(s, u) <= dist(s, t) + dist(t, u) (triangle inequality)

CSdecision <=p CSoptimization: Optimization is NP-hard!

Suppose that we know r and want to find a solution, i.e. choosing the centers. If we already know one of the centers, uf we were to use radius 2r then any point within distance r can be used as the center, and all points will still be covered. So, pick any point s in the disk; the circle with center s and radius 2r covers all the same points.

If there is a technique that produce an answer where radius <= (2 - epsilon)r*, then p = NP. In the plane, using Euclidean distance, if radius <= (1.89 …)r*, then P = NP.

Greedy Algorithm: Let S be the set of sites and C be the set of centers. We are given r.

C = {}
while S != {}
    choose s in S
    C = C + {s}
    delete all s’ in S within 2r of s
end
if |C| <= k return C
else report “No solution for radius r possible”

Claim: “If there is a solution using k radius-r disks, then the algorithm returns a solution using <= k disks of radius 2r.

Each s chosen by the algorithm is in some disk (c*, r) for some c*. Then the circle (s, 2r) covers all the same points covered by (c*, r) plus some more. Thus, later choices of s are not covered by the circle (c*, r). Each s chosen is covered by a different c* circle. There are at most |c*| = k choices for s. The algorithm produces <= k disks of radius 2r covering all points.

Method: Use a binary search to find the right radius. First, box the set of points and find the maximum seperation between any 2 points. This is R and r <= R. We can actually do better! During the algorithm we need a point at distance >= 2r from chosen site s. Always choose the fathest point. For later steps, choose si farthest from existing centers.

New algorithm:

Select any site s in S, let C = {s}
while |c| < k
    select s in S that maximizes that distance from s to any point in C
    C = C + {s}
end
return C

Claim: “If there’s a solution (C*r, r) where |C*r|=k, then the algorithm returns a solution (C, 2r) where |c| = k.

Consider the previous algorithm. We know there is an optimal value for r. Let’s use that r. Each s that is chosen in the new algorithm would be a valid s in the previous algorithm. This holds until we eliminate all points in S, therefore we produce a cover (C, 2r).

Credit to Kevin Canini for the notes!

Lecture 31: Greedy Reductions

Posted in Class Notes by Elliott Back on April 13th, 2005. [Del.icio.us]

We can take a greedy approach, minimum makespan. Given jobs for machines with times t1, t2, … , tn, the total time from start to finish of the slowest machine is as small as possible. Example, three machines with jobs 2, 2, 2, 3, 4, 6:

M1: 2, 2, 2
M2: 3, 4
M3: 6

We can write this as a decision problem. Given the number of machines and a list of job times, and a bound k, is there a way to assign jobs so that Makespan <= k? Makespan is NP-complete, by reduction from Subset Sum! Given a target W and w1, w2, … , wn, split into sets W and S - W where S = sum(wi). Take |W - (S - W)| = |2W - S| = W0, a new constant in the list which forces the best solution where both subsets have the same sum = max(W, S - W).

Given a Subset Sum problem, create W0 = |S - 2W| and let the w’s be the times of jobs. Ask if Makespan works on 2 machines of size <= (S + W0) / 2.

Claim: “There is a solution to Subset Sum if and only if there is a solution to Makespan.”

Assume a Subset Sum solution. Split the w’s into subsets that sum to W and S - W. W0 was chosen so we can make them equal, so there exists a Makespan = (S + W0) / 2, and some machine has a subset of jobs summing to W.

Greedy Strategy: Assign each job to least busy machine, incrementally. How good is this? Call the best Makespan possible T*. The sum(ti) = total time to be spread over m machines. Some machine must have more than sum(ti) / m, so T* >= sum(ti) / m. Consider Mi, the machine with worst time; it’s load is T = Ti. In general, Tj = load on Mj. Our makespan answer is T1 by the greedy algorithm.

Let tj be the last time assigned to Mi. At that moment, Mi’s load is Ti - tj. That must be the smallest load, because we chose it for the last job. So Ti - tj <= Tk for all k.

For each machine: M(Ti - tj) <= sum(Tk) = sum(tj). So Ti - Tj < = sum(tj) / m &;t= T*. Ti &lt= T* + tj so T = Ti <= T* + tj and T* >= worst tk >= tj, so T <= T* + T* -> T <= 2 T*.

The greedy algorithm is always within twice the optimal value. If we presort and do the largest jobs first, we can show that T/T* <= 3/2 and T/T* <= 4/3.

Credit to Kevin Canini for the notes!

Lecture 30: Review

Posted in Class Notes by Elliott Back on April 11th, 2005. [Del.icio.us]

We reviewed for the prelim II today. See previous post: cs482.elliottback.com/archives/2005/04/10/prelim-ii-next-tuesday/

Prelim II Next Tuesday

Posted in Exams by Elliott Back on April 10th, 2005. [Del.icio.us]

Prelim II is next Tuesday. The announcement isn’t on the course website yet, but here are review questions from 2000 spring, for the eager:

  1. You are given a graph G = (V, E) with weights we on its edges {e in E}. The weights can be negative or positive. The Zero-­weight-cycle problem is to decide if there is a simple cycle in G so that the sum of the edge weights on this cycle is exactly 0.

    Prove that this problem is NP­complete.

  2. Since the 3­Dimensional Matching problem is NP­complete, it is natural to expect that the corresponding 4­Dimensional Matching problem is at least as hard. Let us define 4­Dimensional Matching as follows. Given sets W , X , Y , and Z, each of size n, and a collection C of ordered 4­tuples of the form (wi , xj, yk, zl), do there exist n 4-­tuples from C so that no two have an element in common?

    Prove that 4­Dimensional Matching is NP­complete.

  3. Some of your friends have recently graduated and started a small company called WebExodus, which they are currently running out of their parents’ garages in Santa Clara. They’re in the process of porting all their software from an old system to a new, revved­up system; and they’re facing the following problem.

    They have a collection of n software applications, {1, 2, . . . , n}, running on their old system; and they’d like to port some of these to the new system. If they move application i to the new system, they expect a net (monetary) benefit of bi >= 0. The different software applications interact with one another; if applications i and j have extensive interaction, then the company will incur an expense if they move one of i or j to the new system but not both — let’s denote this expense by xi,j >= 0.

    So if the situation were really this simple, your friends would just port all n applications, achieving a total benefit of sum(bi . Unfortunately, there’s a problem . . . Due to small but fundamental incompatibilities between the two systems, there’s no way to port application 1 to the new system; it will have to remain on the old system. Nevertheless, it might still pay to port some of the other applications, accruing the associated benefit and incurring the expense of the interaction between applications on different systems.

    So this is the question they pose to you: which of the remaining applications, if any, should be moved? Give a polynomial­ time algorithm to find a set S # {2, 3, . . . , n} for which the sum of the benefits minus the expenses of moving the applications in S to the new system is maximized.

Update:

New review problems have been posted here: www.cs.cornell.edu/courses/cs482/2005sp/review2.htm

Lecture 29: Handling NP-Complete Problems

Posted in Class Notes by Elliott Back on April 6th, 2005. [Del.icio.us]

There are two general strategies for dealing with those recalcitrant NP-complete problems. You can either take advantage of special characteristics of the problem, or you can come up with an approximate solution. In this lecture, we will be looking at the former.

Imagine what happens when you run Independent Set on a Tree:

Independent Set as Tree

The nodes of the indepent set are labelled in red. How do we get an independent set for a tree? Simply, choosing the leaves first is optimal:

Let T = (V, E) be a tree and v be a leaf vertex. I claim that there is a maximum size independent set that contains v. Consider a max size independent set S. If v is in S, we are done. Assume v is not in S. Let u be a parent of v. If u is not in S, we can put v in S instead to get a larger independent set. If u is in S, we can take out u and put v into S to get S’, an independent set of same size = max size.

Here is an algorithm:

Place all leaves into S
while S is not empty {
    Color the leaves
    Delete leaves and parents
}

You can implement this in linear time using DFS–and even if your graph is not a tree, it still may have some tree-like pieces.

Sample NP-Complete Reductions:

Given n processes and m resources:

a) Each process needs several resources at once. Is there a way for k processes to be active at the same time?

Some ideas include Set Packing, Independent Set. To do it with Set Packing, you would ask if there are k subsets of processes that don’t intersect over resources.

b) Special case of (a) where k = 2.

c) Special case of (a) where there are 2 types of resources and processes require at most one of each type.

d) Each resource is requested by at most 2 processes.

« Previous PageNext Page »