j*******e Posts: 48 | 1 I want to calculate the edit distance between two words. I penalize
insertions, deletions and substitutions by 1 unit each. My question is about cheaper
substitutions: for example, when "a" is replaced by one of its closest neighbors "q", "s" or "z", the
substitution cost should be 0.5. I have a txt file which contains all the pairs
whose substitution cost is 0.5:
a q
a z
a s
b v
.....
How do I write these costs in Perl? I
have the code for the cost = 1 case, but I do not know how to modify it to
include the new substitution case. Thanks.
| j*******e Posts: 48 | 2 Please give me some ideas, I would really appreciate it!
| b******n Posts: 592 | 3 Make the cost a function.
| j*******e Posts: 48 | 4 Hi, how do I write this function? pairs.txt contains all the pairs whose
cost is 0.5, and every other cost is 1. I am new to Perl, could you explain
how to write this cost function? Thank you so much!
| b******n Posts: 592 | 5 As I said, if the pairs in pairs.txt all cost 0.5, you can define a hash whose key is the two
letters:
my %cost;
$cost{"ab"} = 0.5;    # one entry per pair listed in pairs.txt
sub costfunc {
    my ($left, $right) = @_;
    # a hash element is accessed with {}, not []
    if (defined($cost{$left . $right})) {
        return 0.5;
    } else {
        return 1;
    }
}
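One way the %cost hash might be filled from the pairs file is sketched here. The helper name load_pairs is hypothetical, and storing each pair in both orders (so that "a q" also covers "q a") is an assumption the thread does not confirm. The demo reads from an in-memory string so it runs without pairs.txt being present.

```perl
use strict;
use warnings;

# Read "x y" pairs from a filehandle and return a hash ref of half-cost pairs.
# Assumption: the cost is symmetric, so both orders of each pair are stored.
sub load_pairs {
    my ($fh) = @_;
    my %cost;
    while (my $line = <$fh>) {
        my ($x, $y) = split ' ', $line;   # split on whitespace, newline dropped
        next unless defined $y;           # skip blank or malformed lines
        $cost{$x . $y} = 0.5;
        $cost{$y . $x} = 0.5;
    }
    return \%cost;
}

# Demo input held in a string; for the real file one would use something like:
#   open my $fh, '<', 'pairs.txt' or die "pairs.txt: $!";
my $sample = "a q\na z\na s\nb v\n";
open my $fh, '<', \$sample or die "cannot open in-memory file: $!";
my $cost = load_pairs($fh);
close $fh;

print join(' ', sort keys %$cost), "\n";   # prints: aq as az bv qa sa vb za
```

If the costs are meant to be one-directional, the second assignment inside the loop can simply be dropped.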
| j*******e Posts: 48 | 6 Hi, Bluevian, how do I check whether the pair ($a1[$j], $a2[$j]) is in pairs.txt
or not? Thank you for your help!
| b******n Posts: 592 | 7 Initialise %cost with everything in the text file.
Each key in %cost is the concatenation of one pair from the text file; for example, the pair
"a" and "b" becomes "ab".
$key = $a1[$j] . $a2[$j];
if (defined $cost{$key})
will tell you whether the pair is in pairs.txt or not.
It is just an idea, and it is based on my assumption that the number of pairs in the text file is
limited; otherwise it will be slow. |
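Putting the pieces of the thread together, a weighted edit distance could look roughly like the sketch below. It is not the original poster's code, which was never shown. The hard-coded %cost table stands in for the contents of pairs.txt, the name edit_distance is hypothetical, and treating identical letters as cost 0 is an assumption based on the usual dynamic-programming formulation.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical stand-in for the table loaded from pairs.txt (both orders stored).
my %cost = map { $_ => 0.5 } qw(aq qa az za as sa bv vb);

# Substitution cost: 0 for identical letters, 0.5 for listed pairs, otherwise 1.
sub costfunc {
    my ($left, $right) = @_;
    return 0 if $left eq $right;
    return exists $cost{$left . $right} ? 0.5 : 1;
}

# Dynamic-programming edit distance with unit insertion and deletion costs.
sub edit_distance {
    my ($s, $t) = @_;
    my @a1 = split //, $s;
    my @a2 = split //, $t;
    my ($m, $n) = (scalar @a1, scalar @a2);
    my @d;
    $d[$_][0] = $_ for 0 .. $m;    # delete the first $_ letters of $s
    $d[0][$_] = $_ for 0 .. $n;    # insert the first $_ letters of $t
    for my $i (1 .. $m) {
        for my $j (1 .. $n) {
            $d[$i][$j] = min(
                $d[$i - 1][$j] + 1,                                       # deletion
                $d[$i][$j - 1] + 1,                                       # insertion
                $d[$i - 1][$j - 1] + costfunc($a1[$i - 1], $a2[$j - 1]),  # substitution
            );
        }
    }
    return $d[$m][$n];
}

print edit_distance("cat", "cqt"), "\n";          # prints: 0.5
print edit_distance("kitten", "sitting"), "\n";   # prints: 3
```

Only the substitution term differs from the cost = 1 version, so existing code would need a change in just that one line.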