The Student Room Group

Minimum cost of conversion

Hello!!!! :smile:
We are given 2 strings A=[1m] A=[1 \dots m] and B[1n] B[1 \dots n] and the following 3 operations are allowed:



Insert a charachter,with cost cic_i

Delete a character,with cost cdc_d

Replace a character,with cost crc_r








We are looking for the optimal sequence of operations(sequence of operations with the minimum cost),for the conversion of the string AA to the string BB.


T(i,j)= the minimum cost of the conversion of the string A[1i] to B[1j],1im,1jnT(i,j)=\text{ the minimum cost of the conversion of the string } A[1 \dots i] \text{ to } B[1 \dots j], 1 \leq i \leq m, 1 \leq j \leq n


According to my notes:
T(i,j)=min{[br]T(i1,j)+cd // delete the i-th charachter of A[br]T(i,j1)+ci // insert the j-th charachter of B[br]T(i1,j1)if A[i]=B[j][br]T(i1,j1)+crif A[i]B[j][br]T(i,j)=\min \left\{\begin{matrix}[br]T(i-1,j)+c_d & \text{ // delete the i-th charachter of A}\\ [br]T(i,j-1)+c_i & \text{ // insert the j-th charachter of B} \\ [br]T(i-1,j-1) & \text{if } A[i]=B[j]\\ [br]T(i-1,j-1)+c_r & \text{if } A[i] \neq B[j] [br]\end{matrix}\right.


Why is the cost given by the above formula?Could you explain it to me? :confused:
Original post by evinda


According to my notes:
T(i,j)=min{[br]T(i1,j)+cd // delete the i-th charachter of A[br]T(i,j1)+ci // insert the j-th charachter of B[br]T(i1,j1)if A[i]=B[j][br]T(i1,j1)+crif A[i]B[j][br]T(i,j)=\min \left\{\begin{matrix}[br]T(i-1,j)+c_d & \text{ // delete the i-th charachter of A}\\ [br]T(i,j-1)+c_i & \text{ // insert the j-th charachter of B} \\ [br]T(i-1,j-1) & \text{if } A[i]=B[j]\\ [br]T(i-1,j-1)+c_r & \text{if } A[i] \neq B[j] [br]\end{matrix}\right.


Why is the cost given by the above formula?Could you explain it to me? :confused:


Looks like the start of a dynamic programming problem.

You can visualise this as a grid of points running from 0 to m on the x-axis, and 0 to n on the y-axis. Your goal is to reach the point (m,n) by the cheapest route - starting from (0,0).

At each stage you can go one step to the right, or one step up, or one step diagonally up & right.

As in dynamic programming, you start at the point (m,n). Consider T(m,n) to start and how you could have gotten to (m,n) and hopefully all will become clear.
(edited 9 years ago)
Reply 2
Original post by ghostwalker
Looks like the start of a dynamic programming problem.

You can visualise this as a grid of points running from 0 to m on the x-axis, and 0 to n on the y-axis. Your goal is to reach the point (m,n) by the cheapest route - starting from (0,0).

At each stage you can go one step to the right, or one step up, or one step diagonally up & right.

As in dynamic programming, you start at the point (m,n). Consider T(m,n) to start and how you could have gotten to (m,n) and hopefully all will become clear.



For example,if we want to convert the string  exponential \text{ exponential} to the string  polynomial \text{ polynomial},knowing the costs:

cd=ci=2,cr=1 c_d=c_i=2 , c_r=1

I created the following matrix,that is at the attachments.Is it right? If so,is the minimum cost of the conversion then T[11,10]=11 T[11,10]=11 ? Or am I wrong? :confused:
Original post by evinda
For example,if we want to convert the string  exponential \text{ exponential} to the string  polynomial \text{ polynomial},knowing the costs:

cd=ci=2,cr=1 c_d=c_i=2 , c_r=1

I created the following matrix,that is at the attachments.Is it right? If so,is the minimum cost of the conversion then T[11,10]=11 T[11,10]=11 ? Or am I wrong? :confused:


It's going to take me a while to go through that in detail.

From a quick check I got 8, but it could be less.

Notice that cr=1 only if A[i]B[j]c_r=1 \text{ only if } A[i] \neq B[j]

So, for instance, T(11,10) = T(10,9) = T(9,8)=T(8,7) since each time we're adding the same letter to each, "l", "a", "i", respectively.

So, we can effectively forget about the last three letters and just work with converting "exponent" to "polynom"
(edited 9 years ago)
Original post by evinda
...


OK, that was rather horrendous, but here's my table.

As you can see from the bottom right, 8 is the minimum cost.
(edited 9 years ago)
Reply 5
Original post by ghostwalker
OK, that was rather horrendous, but here's my table.

As you can see from the bottom right, 8 is the minimum cost.


Nice!!! :smile:


To find T(3,1) T(3,1),do we have to delete two elements and replace the first one,so ee, by pp?

But,then wouldn't it be equal to 55 ? :confused:


Or have I understood it wrong?
Original post by evinda
Nice!!! :smile:


To find T(3,1) T(3,1),do we have to delete two elements and replace the first one,so ee, by pp?

But,then wouldn't it be equal to 55 ? :confused:


Or have I understood it wrong?


T(3,1) relates to converting "exp" to "p"

So, only need to delete "ex", hence T(3,1)=4

There is more than one route to get there, but that's the shortest.
(edited 9 years ago)
Reply 7
Original post by ghostwalker
T(3,1) relates to converting "exp" to "p"

So, only need to delete "ex", hence T(3,1)=4

There is more than one route to get there, but that's the shortest.


I understand..So,to find T(3,2) T(3,2),do we have to delete one letter(e or x) and replace the one we haven't deleted, by o , since we already have a p? Or can't we do it like that? :confused:
Original post by evinda
I understand..So,to find T(3,2) T(3,2),do we have to delete one letter(e or x) and replace the one we haven't deleted, by o , since we already have a p? Or can't we do it like that? :confused:


We can't do it like that as the p is in the wrong position.

T(3,2) relates to converting "exp" to "po"

We could delete the "ex" to get to "p" and then add the "o" to get to "po" at a cost of 6.

But there is a less costly route, delete the "e" to get to "xp" and then change the two letters to "po" at a cost, in total, of 4.

Edit: It should be possible to determine a route from the table, but I'm just going out now.
Reply 9
Original post by ghostwalker


But there is a less costly route, delete the "e" to get to "xp" and then change the two letters to "po" at a cost, in total, of 4.


How can we change the two letters to "po"? :s-smilie:
Don't we have to replace also at this case both "p" and "o"?
Is it possible to change the position of a letter ,knowing that the only allowed operations are to insert a letter,to delete a letter and to replace a letter ? :confused:
Original post by evinda
How can we change the two letters to "po"? :s-smilie:
Don't we have to replace also at this case both "p" and "o"?


Sorry, I though it was obvious what I meant. We change one letter at a time, which is why the cost is 1+1=2. And 2 for the deletion, making 4.


Is it possible to change the position of a letter ,knowing that the only allowed operations are to insert a letter,to delete a letter and to replace a letter ? :confused:


Yes. By using a combination of the first two you can change the position of a letter, but the cost is cd+ci for each position moved, all other things being equal.

Also I realise you can't determine the optimum route easily without having the costs of each step also recorded in the grid.
Reply 11
Original post by ghostwalker
Sorry, I though it was obvious what I meant. We change one letter at a time, which is why the cost is 1+1=2. And 2 for the deletion, making 4.



Yes. By using a combination of the first two you can change the position of a letter, but the cost is cd+ci for each position moved, all other things being equal.

Also I realise you can't determine the optimum route easily without having the costs of each step also recorded in the grid.


I tried to create again the matrix,that is at the attachments.

Could you tell me why the green boxes are wrong? :confused:

For T(4,6) T(4,6),I thought that we replace the letters e,x,p,o by p,o,l,y and that we insert then the letters n,o.So,the minimum cost is 8 8 ,since if we would keep "po",the cost would be greater.
Am I wrong?
Original post by evinda
I tried to create again the matrix,that is at the attachments.

Could you tell me why the green boxes are wrong? :confused:

For T(4,6) T(4,6),I thought that we replace the letters e,x,p,o by p,o,l,y and that we insert then the letters n,o.So,the minimum cost is 8 8 ,since if we would keep "po",the cost would be greater.
Am I wrong?


Notice that T(3,5) =7 for converting "exp" to "polyn". The next letter in each case is an "o", so no cost, hence T(4,6) is at most 7.

We can do replace e,x,p with p,o,l at a cost of 3, insert y,n at a cost of 4, and then the final o costs nothing.

For T(6,5).

Notice that in your version you have T(6,5) = T(5,5) + 3. Clearly you can find a cheaper route from T(5,5).

PS: If you haven't done so already, I'd write a program to do all this.
(edited 9 years ago)
Reply 13
Original post by ghostwalker
Notice that T(3,5) =7 for converting "exp" to "polyn". The next letter in each case is an "o", so no cost, hence T(4,6) is at most 7.

We can do replace e,x,p with p,o,l at a cost of 3, insert y,n at a cost of 4, and then the final o costs nothing.

For T(6,5).

Notice that in your version you have T(6,5) = T(5,5) + 3. Clearly you can find a cheaper route from T(5,5).



I understand..Thank you very much!!! :smile:

Original post by ghostwalker


PS: If you haven't done so already, I'd write a program to do all this.


That would be nice,I haven't written a program yet.. :colondollar:

Quick Reply

Latest