Skip to content

Commit

Permalink
adding branch and bound for KP
Browse files Browse the repository at this point in the history
  • Loading branch information
jmsallan committed Aug 13, 2017
1 parent 92ececb commit a4fb38b
Show file tree
Hide file tree
Showing 4 changed files with 260 additions and 9 deletions.
Binary file modified .DS_Store
Binary file not shown.
Empty file added KP/.Rhistory
Empty file.
252 changes: 252 additions & 0 deletions KP/2017-03-22gaKP.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,252 @@
---
title: "A genetic algorithm for the Knapsack problem"
output:
html_document: default
pdf_document: default
word_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```


The aim of this document is to show how works a genetic algorithm (GA) for the knaspack problem. It is not a good alternative to solve these problems, but it can help to show how GAs work.

#Defining test instances

To test the algorithm, we define first a function that generates instances of the knapsack problem. It is based in recommendations of

Pisinger, D. (2005). Where are the hard knapsack problems?. Computers & Operations Research, 32(9), 2271-2284.

The utility of each instance is quite close to its weight, so it can be somewhat hard to solve:

```{r KP instances function}
KPInstance01 <- function(R, size, h, seed=1212){
set.seed(1212)
w <- sample(1:R, size, replace=TRUE)
u <- sapply(1:size, function(x) sample((w[x] + R/10 - R/500):(w[x] + R/10 + R/500), 1))
W <- round(h/(101)*sum(w))
return(list(w=w,u=u,W=W))
}
```

Every instance is encoded as a list containing the weights `w`, utilities `u` and knapsack capacity `W`.

#Defining the GA

##Crossover and mutation operators

In this section, the building blocks of the genetic algorithm are defined. First I define the **fitness function**. It is equal to total utility for feasible solutions, and equal to zero to non-feasible solutions:

````{r KP fitness}
FitnessKP <- function(dades, sol){
if(sum(dades$w*sol) <=dades$W) fit <- sum(dades$u*sol)
else fit <- 0
return(fit)
}
````

The `CrossoverKP1point` function defines a 1-point **crossover operator** for the KP. It takes for granted that we encode the solution as a binary string of length equal to the number of items `n`.

```{r KP 1-point crossover}
CrossoverKP1point <- function(sol1, sol2){
n <- length(sol1)
point <- sample(1:(n-1), 1)
son <- c(sol1[1:point], sol2[(point +1):n])
return(son)
}
```

To test the operator, here are several crossovers with a all-zeroes and a all-ones strings:

```{r KP crossover test}
set.seed(1213)
a <- rep(1, 10)
b <- rep(0, 10)
for(i in 1:5) print(CrossoverKP1point(a,b))
```

`MutationKP3points` defines a three-point **mutation operator**. It changes the value of three consecutive positions of a string:

```{r KP mutation}
MutationKP3points <- function(sol){
n <- length(sol)
point <- sample(1:(n-2), 1)
sol <- as.logical(sol)
sol[point:(point+2)] <- !sol[point:(point+2)]
sol <- as.numeric(sol)
return(sol)
}
```

Let's run five times the mutation operator for a all-ones string:

```{r KP mutation show}
ex <- rep(1, 10)
for(i in 1:5) print(MutationKP3points(ex))
```


##The genetic algorithm function

Here is the function that runs the GA for an instance of KP:

```{r gaKP}
GeneticAlgorithmKP <- function(dades, npop, iterations, pmut, elitist =TRUE, verbose=FALSE){
#problem size
n <- length(dades$w)
#initial population
rel.size <- min(dades$W/sum(dades$w),1)
population.parent <- replicate(npop, sample(0:1, n, replace = TRUE, prob=c(1-rel.size, rel.size)), simplify=FALSE)
#initializing next generation
population.son <- replicate(npop, numeric(n), simplify=FALSE)
best.obj <- 0
best.sol <- numeric(n)
count <- 1
while(count <= iterations){
#assess fitness of population parent
fitness <- sapply(population.parent, function(x) FitnessKP(dades, x))
#finding best solution
iter.obj <- max(fitness)
pos.max <- which(fitness == max(fitness))[1]
iter.sol <- population.parent[[pos.max]]
#update best solution
if(iter.obj > best.obj){
best.sol <- iter.sol
best.obj <- iter.obj
}
#elitist strategy: replace worst solution by best solution so far
if(elitist){
pos.min <- which(fitness == min(fitness))[1]
population.son[[pos.min]] <- best.sol
fitness[pos.min] <- best.obj
}
#verbose printing
if(verbose & (count%%10==0 | count==1)) {
print(paste("population", count))
sapply(1:npop, function(x) print(c(population.parent[[x]], fitness[x]), digits=0))
}
#computing probability of selection
prob <- fitness^2
prob <- prob/sum(prob)
#creating son population
for(i in 1:npop){
#crossover
parents <- sample(1:npop, 2, prob = prob)
v1 <- population.parent[[parents[1]]]
v2 <- population.parent[[parents[2]]]
population.son[[i]] <- CrossoverKP1point(v1, v2)
#mutation
if(pmut > runif(1)) population.son[[i]] <- MutationKP3points(population.son[[i]])
}
#population son becomes parent
population.parent <- population.son
count <- count + 1
}
return(list(sol=best.sol, obj=best.obj))
}
```

Some comments to this function:

* Function parameters are: instance data `dades`, population size `npop`, number of `iterations` to be performed and probability of mutation `pmut`. If the logical `verbose` is set to `TRUE`, the function prints the population and fitness of each element every ten runs.
* Selection probability is proportional to the square of the fitness function of each instance. Then, all unfeasible solutions will have selection probability equal to zero.
* The GA may be stuck if all solutions of a population are unfeasible. To avoid this, I have included in each element of the starting population a number of items proportional to the ratio weight of knapsack vs. total weight of items.
* If the flag `elitist` is true, in each generation one of the worst solutions is replaced by the best solution obtained so far.

#Generating instances

To test the algorithm, I have generated three instances:

```{r setting instancesJP}
InstanceKP10 <- KPInstance01(1000, 10, 40)
InstanceKP50 <- KPInstance01(1000, 50, 40, 2424)
InstanceKP100 <- KPInstance01(1000, 100, 40, 3333)
```

In the case of the KP, we can obtain the optimal solution using linear programming. This function returns the results of linear programming in the same format of the GA function:

```{r LPKP}
LPKP <- function(dades){
library(Rglpk)
n <- length(dades$u)
obj <- dades$u
mat <- matrix(dades$w, nrow=1)
types <- rep("B", n)
sol.lp <- Rglpk_solve_LP(obj=obj, mat=mat, dir="<=", rhs=dades$W, types=types, max = TRUE)
return(list(sol = sol.lp$solution, obj = sol.lp$optimum))
}
```

Then we proceed to solve instances of size 10 and 50 with linear programming:

```{r solve LPKP, message=FALSE}
solPL.KP10 <- LPKP(InstanceKP10)
solPL.KP50 <- LPKP(InstanceKP50)
```

#GA in action

Let's run the genetic algorithm for `InstanceKP10` and probability of mutation equal to zero:

```{r runga01}
set.seed(4444)
solGA.KP10 <- GeneticAlgorithmKP(InstanceKP10, 10, 50, 0, elitist=FALSE, verbose=TRUE)
```

The algorithm gets stuck in the best feasible solution of the initial population. Let's run the algorithm with nonzero probability of mutation and elitist strategy:

```{r runga02}
set.seed(4444)
solGA.KP10 <- GeneticAlgorithmKP(InstanceKP10, 10, 100, 0.2, elitist=TRUE, verbose=TRUE)
```

Finally, we will run the algorithm with 1000 iterations:

```{r runga04}
set.seed(4444)
solGA.KP10 <- GeneticAlgorithmKP(InstanceKP10, 10, 1000, 0.2, elitist=TRUE, verbose=FALSE)
solGA.KP10
solPL.KP10
```

For the instances of size 50 and 100 we have:

```{r runga05}
set.seed(4444)
solGA.KP50 <- GeneticAlgorithmKP(InstanceKP50, 10, 1000, 0.2, elitist=TRUE, verbose=FALSE)
solGA.KP50
solPL.KP50
```

```{r runga06}
set.seed(4444)
solGA.KP100 <- GeneticAlgorithmKP(InstanceKP100, 10, 1000, 0.2, elitist=TRUE, verbose=FALSE)
solGA.KP100
```

The objective value of the optimal solution of the instance with 100 items is 24643. The genetic algorithm works well solving small instances of the knapsack problem.


17 changes: 8 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
# heuristics

This is the gitHub repository of heuristics of Jose M Sallan. Here you will find quite simple heuristics to attack combinatorial problems, as the code here is set for teaching purposes.
This is the gitHub repository of heuristics of [Jose M Sallan](http://josemsallan.blogspot.com/). Here you will find quite simple heuristics to attack combinatorial problems, as the code here is set for teaching purposes.

At this moment this repository contains the following material
At this moment this repository contains the following material:

##Travelling Salesperson Problem (TSP)
## Travelling Salesperson Problem (TSP)

* **functionsTSP.R** contains several functions and algorithms to tackle the TSP using R, including an implementation of the nearest neighbor heuristic, and implementations of hill climbing, tabu search, simulated annealing and genetic algorithms. The file also contains a function to generate random instances of a given size.
* **demoTSP.R** contains examples of use of the functions of the previous file, one on a randomly generated instance of size 20, and another on the **att48** instance.
Expand All @@ -15,13 +15,12 @@ At this moment this repository contains the following material
## Quadratic assignment problem (QAP)

* In **QAPfunctions.R** are listed several functions and algorithms to tackle the QAP using R, based on the ones defined for the TSP.
* **QAPresults** solves several instances of QAP using the heuristics defined in **QAPfunctions.R**. Illustrates how to get instances from the Internet, and how to use markdown to write reproducible reports.
* **QAPtypewriter.R** evaluates several large instances of QAP as in the previous file. Here I have used the stargazer package to build text and LaTeX outputs.
* **QAPresults** solves several instances of QAP using the heuristics defined in **QAPfunctions.R**. Illustrates how to get instances from the Internet, and how to use markdown to write reproducible reports. Results are stored in **QAPresults.RData**.
* **QAPtypewriter.R** evaluates several large instances of QAP as in the previous file. Here I have used the stargazer package to build text and LaTeX outputs. Results can be found in **QAPresultsTypewriters.RData**.

##Knapsack Problem (KP)
## Knapsack Problem (KP)

I have developed a genetic algorithm to solve the Knapsack problem. It can be easily tackled through linear programming, but this GA is illustrative about how to deal with a maximum problem with unfeasible solutions. The heuristic and the instances are packed in a single script.

In the **functionsKP.R** can be found several instance generators of the KP and of the subset sum (SS) problem. The SS is a variant of the KP where all items have an utility equal to its weight. There are two heuristics implemented: a genetic algorithm (which works somewhat well for small instances) and a branch and bound, which returns the optimal solution.
* I have developed a genetic algorithm and a branch and bound method to solve the Knapsack problem (KP). KP can be easily tackled through linear programming, but I have used GA to illustrate about how to deal with a maximum problem with unfeasible solutions. The heuristic and the instances are packed in a single script. The script **functionsKP.R** contains the functions that execute both algorithms, together with instance generators for the KP and the subset sum(SS) problem. The SS is a variant of the KP where all items have an utility equal to its weight. The script **demoKP.R** demonstrates how do these functions work.

* The **2017-03-22gaKP** files are a RMarkdown document explaining how the GA works.

0 comments on commit a4fb38b

Please sign in to comment.