Hello. Almost every site runs into the task of searching for services or products, and in most cases the implementation is limited to looking up the exact word typed into the search box.
If there is time and the customer wants a little more, people google an implementation of the most popular algorithm (the "Levenshtein distance") and plug it in.
In this article I describe a heavily modified algorithm based on the Levenshtein distance and give C# code examples for fuzzy search by name, for example the names of cafes, restaurants, or particular services, including names that consist of several words:
Yandex, Mail, ProjectArmata, World of Tanks, World of Warships, World of Warplanes, and so on.
For those not familiar with the algorithm, I recommend first reading "Implementing Fuzzy Search" and "Fuzzy Search in Text and Dictionary", or my own short summary below.
The Levenshtein distance algorithm
The Levenshtein distance is the most widely used algorithm on the web for measuring how different two words are, that is, the minimum number of operations needed to turn the first string into the second.
There are only three such operations:
• deletion
• insertion
• substitution
For example, for the two strings "CONNECT" and "CONEHEAD" the following transformation table can be built:
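In theory, the cost of an operation may depend on its type (insertion, deletion, substitution) and/or on the symbols involved. In the general case, however:
w(a, ε), the cost of deleting the symbol "a", is 1
w(ε, b), the cost of inserting the symbol "b", is 1
w(a, b), the cost of replacing the symbol "a" with the symbol "b", is 1
Ones are used everywhere.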
From the mathematical point of view the calculation looks as follows.
Let S1 and S2 be two strings (of lengths M and N respectively) over some alphabet; then the edit distance (Levenshtein distance) d(S1, S2) can be computed with the following recurrence:
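The formula itself did not survive in this copy of the article. As a stand-in, here is the standard Wagner-Fischer recurrence written with the costs w defined above; the notation d_{i,j} (distance between the first i symbols of S1 and the first j symbols of S2) is my own assumption, not necessarily the author's.

```latex
\begin{align*}
d_{0,0} &= 0, \qquad d_{i,0} = i, \qquad d_{0,j} = j, \\
d_{i,j} &= \min
\begin{cases}
d_{i-1,j} + w(a, \varepsilon) & \text{(delete } a = S_1[i]\text{)} \\
d_{i,j-1} + w(\varepsilon, b) & \text{(insert } b = S_2[j]\text{)} \\
d_{i-1,j-1} + [a \neq b]\, w(a, b) & \text{(substitute, free when } a = b\text{)}
\end{cases} \\
d(S_1, S_2) &= d_{M,N}
\end{align*}
```

With all three costs equal to 1 this is exactly what the C# implementations in this article compute.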
The "default implementation" of the Levenshtein algorithm is due to two mathematicians, Wagner and Fischer [not to be confused with the chess player].
In C# it looks like this:
    private Int32 levenshtein(String a, String b)
    {
        // An empty string is turned into the other one by inserting all of its characters.
        if (string.IsNullOrEmpty(a))
        {
            if (!string.IsNullOrEmpty(b)) { return b.Length; }
            return 0;
        }
        if (string.IsNullOrEmpty(b))
        {
            if (!string.IsNullOrEmpty(a)) { return a.Length; }
            return 0;
        }

        Int32 cost;
        Int32[,] d = new int[a.Length + 1, b.Length + 1];
        Int32 min1;
        Int32 min2;
        Int32 min3;

        // First column and first row: distance from/to the empty prefix.
        for (Int32 i = 0; i <= d.GetUpperBound(0); i += 1) { d[i, 0] = i; }
        for (Int32 i = 0; i <= d.GetUpperBound(1); i += 1) { d[0, i] = i; }

        for (Int32 i = 1; i <= d.GetUpperBound(0); i += 1)
        {
            for (Int32 j = 1; j <= d.GetUpperBound(1); j += 1)
            {
                cost = Convert.ToInt32(!(a[i - 1] == b[j - 1]));
                min1 = d[i - 1, j] + 1;        // deletion
                min2 = d[i, j - 1] + 1;        // insertion
                min3 = d[i - 1, j - 1] + cost; // substitution (or match)
                d[i, j] = Math.Min(Math.Min(min1, min2), min3);
            }
        }
        return d[d.GetUpperBound(0), d.GetUpperBound(1)];
    }
(Taken from here.)
Visually, the work of the algorithm on the words "Arestant" and "Dagestan" looks like this:
The bottom-right corner of the matrix shows the difference between the words. From the matrix we can see that in this particular case the difference is 3 notional "parrots" (arbitrary units).
If that is not clear, look at the same words in another view:
_ A R E S T A N T
D A G E S T A N _
So, to turn the word "Arestant" into "Dagestan", we need to add one letter "D", replace one letter "R" with "G", and delete the letter "T". Since every operation has a weight of 1, the difference between the words is 3 parrots.
If the words match completely, the distance is 0. That is the whole theory; everything ingenious is simple.
It would seem that this is all: everything has been invented for us and there is nothing left to do. But there are problems...
1) The probability of a typo depends on the distance between keys on the keyboard, on phonetic groups, and on ordinary typing speed, so a weight of 1 for every operation is not always quite right.
2) Levenshtein's idea is aimed at finding the difference between whole words, not parts of words, and that matters for results generated dynamically as letters are typed.
3) Service names often consist of several words, and a person may simply not remember their order.
We will also take a few more factors into account:
• a keyboard layout in another language
• transliteration of letters
These are the problems we will try to solve in this article.
First, let's agree to bring all words to a single letter case. In my version of the code I chose lowercase, and this is reflected in the lookup dictionaries that are needed (they are given in the course of the explanation). In the article itself I use various spellings, such as the CamelCase "ProjectArmata", but only for the convenience of the human reader; from the point of view of the analysis there is one case (lower). Also, as the base we take not the classic implementation but an optimized version of the code for finding the Levenshtein distance:
We modify it slightly by removing the swap of the two words. For the Levenshtein algorithm the order of the words does not matter, but for us it matters for other reasons. As a result we get the following:
    public int LevenshteinDistance(string source, string target){
        if(String.IsNullOrEmpty(source)){
            if(String.IsNullOrEmpty(target)) return 0;
            return target.Length;
        }
        if(String.IsNullOrEmpty(target)) return source.Length;

        var m = target.Length;
        var n = source.Length;
        // Only two rows of the distance matrix are kept in memory.
        var distance = new int[2, m + 1];
        // ...
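The listing above breaks off right after the two-row array is allocated. The following is a minimal sketch of how such a two-row function can be completed with unit costs and without swapping the arguments. The class name TwoRowLevenshtein and the loop body are my assumptions based on the variable names used later in the article (currentRow, previousRow), not the author's exact code.

```csharp
using System;

public static class TwoRowLevenshtein
{
    // Two-row Levenshtein distance with unit costs. Source and target are
    // never swapped, so their order is preserved for the later steps.
    public static int Distance(string source, string target)
    {
        if (string.IsNullOrEmpty(source)) return string.IsNullOrEmpty(target) ? 0 : target.Length;
        if (string.IsNullOrEmpty(target)) return source.Length;

        int m = target.Length;
        int n = source.Length;
        var distance = new int[2, m + 1];

        // Row 0: an empty prefix of source becomes target[0..j) with j insertions.
        for (int j = 1; j <= m; j++) distance[0, j] = j;

        int currentRow = 0;
        for (int i = 1; i <= n; i++)
        {
            currentRow = i & 1;               // rows alternate: 1, 0, 1, ...
            int previousRow = currentRow ^ 1;
            distance[currentRow, 0] = i;      // i deletions
            for (int j = 1; j <= m; j++)
            {
                int cost = target[j - 1] == source[i - 1] ? 0 : 1;
                distance[currentRow, j] = Math.Min(
                    Math.Min(distance[previousRow, j] + 1,      // deletion
                             distance[currentRow, j - 1] + 1),  // insertion
                    distance[previousRow, j - 1] + cost);       // substitution or match
            }
        }
        return distance[currentRow, m];
    }
}
```

For instance, TwoRowLevenshtein.Distance("arestant", "dagestan") gives 3, the same value as the full-matrix version, while using memory proportional to the length of the target only.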
We start improving the search algorithm by changing the weights. To begin with, insertion and deletion get a weight of 2.
That is, the following lines change:
    for(var j = 1; j <= m; j++) distance[0, j] = j * 2;
    ...
    distance[currentRow, 0] = i * 2;
    ...
    distance[previousRow, j] + 2
    ...
    distance[currentRow, j - 1] + 2
And the line that computes the cost of substituting a letter is turned into a call to the function CostDistanceSymbol:
var cost = (target[j - 1] == source[i - 1] ? 0 : 1);
In it we will consider two factors:
1) keyboard distance
2) phonetic groups
For more convenient work with source and target, we turn them into objects:
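    class Word {
        public string Text { get; set; }
        // Key code of every character of Text, in the same order.
        public List<int> Codes { get; set; } = new List<int>();
    }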
For this we also need the following auxiliary dictionaries.
Key codes for the Russian keyboard layout:
    private static SortedDictionary<char, int> CodeKeysRus = new SortedDictionary<char, int> {
        { 'ё' , 192 }, { '1' , 49 }, { '2' , 50 }, ...
        { '-' , 189 }, { '=' , 187 },
        { 'й' , 81 }, { 'ц' , 87 }, { 'у' , 69 }, ...
        { '_' , 189 }, { '+' , 187 }, { ',' , 191 },
    };
Key codes for the English keyboard layout:
    private static SortedDictionary<char, int> CodeKeysEng = new SortedDictionary<char, int> {
        { '`', 192 }, { '1', 49 }, { '2', 50 }, ...
        { '-', 189 }, { '=', 187 },
        { 'q', 81 }, { 'w', 87 }, { 'e', 69 }, { 'r', 82 }, ...
        { '<', 188 }, { '>', 190 }, { '?', 191 },
    };
In mathematical terms, these two dictionaries let us map two different symbol spaces onto one universal space (key codes),
in which the following relation also holds:
    private static SortedDictionary<int, List<int>> DistanceCodeKey = new SortedDictionary<int, List<int>> {
        { 192, new List<int>(){ 49 }},
        { 49 , new List<int>(){ 50, 87, 81 }},
        { 50 , new List<int>(){ 49, 81, 87, 69, 51 }}, ...
        { 189, new List<int>(){ 48, 80, 219, 221, 187 }},
        { 187, new List<int>(){ 189, 219, 221 }},
        { 81 , new List<int>(){ 49, 50, 87, 83, 65 }},
        { 87 , new List<int>(){ 49, 81, 65, 83, 68, 69, 51, 50 }}, ...
        { 188, new List<int>(){ 77, 74, 75, 76, 190 }},
        { 190, new List<int>(){ 188, 75, 76, 186, 191 }},
        { 191, new List<int>(){ 190, 76, 186, 222 }},
    };
That is, for each key we take the keys that surround it. You can check this against the picture:
Yes, I know that besides QWERTY there are other layouts, for which the key-distance table will not match, but we take the most common option.
If anyone knows a better way, please write.
Now we are ready for the first step: computing the weight of a mistyped symbol in the CostDistanceSymbol function.
As before, if the characters are the same, the distance is 0:
if (source.Text[sourcePosition] == target.Text[targetPosition]) return 0;
If the key codes are the same, the distance is also 0:
    // Same physical key (and the key is known): compare source against target.
    if (source.Codes[sourcePosition] != 0 && source.Codes[sourcePosition] == target.Codes[targetPosition]) return 0;
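If you are wondering why key codes are compared after the characters, the answer is simple: we want the words "Вода" ("water") and "Djlf" to be treated as the same word. We also want characters that can be typed in either layout, such as ";" or ",", to be treated the same regardless of the layout.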
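To make this concrete, here is a small sketch of the mapping into key codes. It uses only a tiny excerpt of the two layout dictionaries (the four keys needed for the example), and the class KeyCodeDemo with its simplified GetKeyCodes and SameKeys helpers is hypothetical, written for illustration rather than taken from the article's full listing.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class KeyCodeDemo
{
    // Excerpt of the two layout dictionaries: only the keys D, J, L and F.
    static readonly Dictionary<char, int> CodeKeys = new Dictionary<char, int>
    {
        { 'в', 68 }, { 'о', 74 }, { 'д', 76 }, { 'а', 70 }, // Russian layout
        { 'd', 68 }, { 'j', 74 }, { 'l', 76 }, { 'f', 70 }, // English layout
    };

    // Maps every character to its key code; unknown characters become 0.
    public static List<int> GetKeyCodes(string word)
    {
        return word.ToLowerInvariant()
                   .Select(c => CodeKeys.TryGetValue(c, out int code) ? code : 0)
                   .ToList();
    }

    // True when both words were typed with the same sequence of physical keys.
    public static bool SameKeys(string a, string b)
    {
        return GetKeyCodes(a).SequenceEqual(GetKeyCodes(b));
    }
}
```

KeyCodeDemo.SameKeys("Вода", "Djlf") returns true, because both words map to the codes 68, 74, 76, 70. Note that this sketch treats two unknown characters (both 0) as equal, which is why the article's check also requires the code to be non-zero.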
Beyond that we look only at the codes and at how close the keys are to each other. If they are adjacent, the distance is 1; if not, it is 2 (exactly the same weight as for an insertion or deletion):
    int resultWeight = 0;
    List<int> nearKeys;
    // Unknown key: treat it as far away.
    if (!DistanceCodeKey.TryGetValue(source.Codes[sourcePosition], out nearKeys))
        resultWeight = 2;
    else
        resultWeight = nearKeys.Contains(target.Codes[targetPosition]) ? 1 : 2;
In fact, this small refinement covers a huge number of accidental errors, from the wrong keyboard layout to simply missing a key.
But besides typos, a person may simply not know how a word is spelled. Examples: "Bike. Vulgar. Test."
Such words exist not only in Russian but in English too, and these cases also need to be taken into account. For English words we take as a basis the work of Zobel and Dart on phonetic groups:
"aeiouy", "bp", "ckq", "dt", "lr", "mn", "gj", "fpv", "sxz", "csz"
For Russian I composed the groups myself (shown here in transliteration):
"yy", "eee", "ay", "oye", "uy", "shch", "oa", "yo"
We turn these phonetic groups into an object of the following form:
    PhoneticGroupsEng = {
        { 'a', { 'e', 'i', 'o', 'u', 'y' } },
        { 'e', { 'a', 'i', 'o', 'u', 'y' } }
        ...
    }
This can be done by hand or in code; the result is the same. Now, after checking the key codes, we can check whether the letters fall into one phonetic group, using the same logic as in the previous step:
    List<char> phoneticGroups;
    if (PhoneticGroupsRus.TryGetValue(target.Text[targetPosition], out phoneticGroups))
        resultWeight = Math.Min(resultWeight, phoneticGroups.Contains(source.Text[sourcePosition]) ? 1 : 2);
    if (PhoneticGroupsEng.TryGetValue(target.Text[targetPosition], out phoneticGroups))
        resultWeight = Math.Min(resultWeight, phoneticGroups.Contains(source.Text[sourcePosition]) ? 1 : 2);
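Putting the checks of this section together, a symbol-cost function might look roughly like the sketch below. It works on single characters instead of Word objects, contains only excerpts of the tables, and derives key codes with a shortcut (upper-case ASCII code for Latin letters). All of these simplifications, and the names SymbolCostDemo, KeyCode and Cost, are my assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SymbolCostDemo
{
    // Excerpt of the neighbour table (key code -> adjacent key codes).
    static readonly Dictionary<int, List<int>> DistanceCodeKey = new Dictionary<int, List<int>>
    {
        { 81, new List<int> { 49, 50, 87, 83, 65 } },             // Q
        { 87, new List<int> { 49, 81, 65, 83, 68, 69, 51, 50 } }, // W
    };

    // Excerpt of the English phonetic groups.
    static readonly Dictionary<char, List<char>> PhoneticGroupsEng = new Dictionary<char, List<char>>
    {
        { 'a', new List<char> { 'e', 'i', 'o', 'u', 'y' } },
        { 'e', new List<char> { 'a', 'i', 'o', 'u', 'y' } },
    };

    // Assumption of this sketch: for lowercase Latin letters the key code is
    // the code of the upper-case letter; everything else maps to 0 (unknown).
    static int KeyCode(char c)
    {
        return c >= 'a' && c <= 'z' ? (int)char.ToUpperInvariant(c) : 0;
    }

    public static int Cost(char source, char target)
    {
        if (source == target) return 0;                 // same character
        int sourceCode = KeyCode(source);
        int targetCode = KeyCode(target);
        if (sourceCode != 0 && sourceCode == targetCode) return 0; // same key

        // Adjacent keys cost 1, anything else costs 2.
        List<int> nearKeys;
        int weight = (DistanceCodeKey.TryGetValue(sourceCode, out nearKeys)
                      && nearKeys.Contains(targetCode)) ? 1 : 2;

        // A letter from the same phonetic group also costs 1.
        List<char> group;
        if (PhoneticGroupsEng.TryGetValue(target, out group))
            weight = Math.Min(weight, group.Contains(source) ? 1 : 2);
        return weight;
    }
}
```

With these excerpts, Cost('q', 'w') is 1 (adjacent keys), Cost('i', 'a') is 1 (same phonetic group), and Cost('q', 'p') is 2 (neither adjacent nor phonetically related).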
Besides the typos described above, there are also "typing speed" typos, where two consecutive letters get swapped while typing. This is such a common mistake that the mathematician Frederick Damerau extended the Levenshtein algorithm with one more operation, the transposition (swap) of two characters.
In code terms, we add the following to the LevenshteinDistance function:
    if (i > 1 && j > 1
        && source.Text[i - 1] == target.Text[j - 2]
        && source.Text[i - 2] == target.Text[j - 1]) {
        distance[currentRow, j] = Math.Min(distance[currentRow, j], distance[(i - 2) % 3, j - 2] + 2);
    }
Note. The optimized code we took as a base initializes the distance matrix as var distance = new int[2, m + 1];
so the fragment "distance[(i - 2) % 3, ..." will not work in this form. The correct version of the function is given at the end of the article.
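The index (i - 2) % 3 hints at keeping three rolling rows instead of two, so that the row from two steps back is still available. Below is a minimal sketch of that idea on plain strings. To keep it short it assumes a flat substitution cost of 2 (the worst case of CostDistanceSymbol) instead of the keyboard and phonetic weighting, and the class name TranspositionDemo is hypothetical; it is not the author's final function.

```csharp
using System;

public static class TranspositionDemo
{
    // Weighted distance with adjacent transposition, kept in three rolling rows
    // so that row (i - 2) is still available while row i is being filled.
    public static int Distance(string source, string target)
    {
        int n = source.Length, m = target.Length;
        var distance = new int[3, m + 1];
        for (int j = 1; j <= m; j++) distance[0, j] = j * 2; // insertions cost 2

        for (int i = 1; i <= n; i++)
        {
            int currentRow = i % 3;
            int previousRow = (i - 1) % 3;
            distance[currentRow, 0] = i * 2;                 // deletions cost 2
            for (int j = 1; j <= m; j++)
            {
                // Assumption: any mismatch costs 2 (no keyboard/phonetic discount).
                int cost = source[i - 1] == target[j - 1] ? 0 : 2;
                distance[currentRow, j] = Math.Min(
                    Math.Min(distance[previousRow, j] + 2, distance[currentRow, j - 1] + 2),
                    distance[previousRow, j - 1] + cost);

                // Two neighbouring letters typed in the wrong order.
                if (i > 1 && j > 1
                    && source[i - 1] == target[j - 2]
                    && source[i - 2] == target[j - 1])
                {
                    distance[currentRow, j] = Math.Min(distance[currentRow, j],
                        distance[(i - 2) % 3, j - 2] + 2);
                }
            }
        }
        return distance[n % 3, m];
    }
}
```

With these weights TranspositionDemo.Distance("porject", "project") is 2: one transposition, where two separate substitutions would have cost 4.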
So the first step of refining the code is done, and we move on to the second point. Recall that Levenshtein's idea is aimed at telling whole words apart, not parts of words, which matters for dynamic output as letters are typed.
For example, the dictionary contains two words:
• "ProjectArmata"
• "Bike"
When the query "Pro" is typed into the search box, "Bike" gets the higher priority, because it takes only two substitutions and one deletion, 3 parrots (counting with unit weights and the classic Levenshtein algorithm, without our changes), whereas appending the remaining part "jectArmata" costs 10 parrots.
The most logical thing in this situation is to compare the typed string not with the whole word but only with a part of it.
Since the query "Pro" consists of 3 letters, we take the first 3 letters of the word "ProjectArmata", that is "Pro", and get a 100% match. For this particular case that is perfect, but let's look at a few more options. Suppose the database contains the following set of words:
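• "Kommunalka"
• "Conveyor"
• "Colony"
• "Cosmetics"
The search query is "Com" (the examples are Russian words; in the original spelling all four begin with the same two letters as the query). As a result, the match coefficients of the words are:
Kommunalka - 0
Conveyor - 1
Colony - 1
Cosmetics - 1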
While everything is fine with "Kommunalka", the other three words look indistinguishable from one another. Our job is to give the user the best-fitting result, not just everything in a row. Besides, as Damerau noted, most mistakes are transpositions of letters.
To get rid of this problem we make a small correction.
Instead of the first n letters we take n + 1, where n is the number of letters in the query. The coefficients for the query "Com" then become:
Kommunalka - 1
Cosmetics - 1
Conveyor - 2
Colony - 2
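The two tables can be reproduced with a few lines of code. The sketch below compares a query with the first n or n + 1 letters of each word using plain unit-cost Levenshtein. The romanized spellings used in the usage note ("kommunalka", "kosmetika", "konveyer", "koloniya", query "kom") are my own transliterations, chosen so that the words share their first two letters as in the original; PrefixDemo and PrefixDistance are hypothetical names.

```csharp
using System;

public static class PrefixDemo
{
    // Plain Levenshtein distance with unit costs (full matrix, for brevity).
    static int Levenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    // Compares the query with the first (query.Length + extra) letters of the word.
    public static int PrefixDistance(string word, string query, int extra)
    {
        int length = Math.Min(word.Length, query.Length + extra);
        return Levenshtein(word.Substring(0, length), query);
    }
}
```

With extra = 0 the four words score 0, 1, 1, 1 against "kom" ("kosmetika", "konveyer" and "koloniya" are tied). With extra = 1 the scores become 1 for "kommunalka" and "kosmetika" and 2 for "konveyer" and "koloniya", which matches the second table.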
"Cosmetics" has moved up, while "Kommunalka" no longer scores a perfect 0... but I like this variant better, for the following reason. Usually, when search is needed, results are shown to the user in a drop-down list under the search box as soon as the first letters are typed. The length of that list is limited to about 3 to 7 entries. So if there are only three entries, the second variant shows "Kommunalka", "Cosmetics" and "Conveyor" [the latter because it comes first in the raw output, ordered by GUID or simply by creation date]. In the first case we would get "Kommunalka", "Conveyor" and "Colony", with no "Cosmetics", which would be unlucky for unrelated reasons...
Of course, there are other solutions to this problem. For example, sort by the first n letters, then take the groups of words with equal coefficients and re-sort them by n + 1 letters, and so on again and again... it all depends on the task at hand and the available computing power.
Let's not dwell on this problem now... we are only halfway through, and I still have things to tell.
The next nuance of good search results comes to us from Soviet times, when people loved names made of several words glued into one. It is still relevant today:
PotrebSoyuz
GazPromBank
RosselkhozBank
ProjectArmata
BankUralSib
etc.
[P.S. I know that some of these names are actually written with spaces, but I needed vivid examples.]
But with our algorithm we always take the first n + 1 letters of the word, so when the word "Bank" is typed, the output will not be adequate, because the following strings are compared:
BankU
GazPr
Rosse
Potre
Proje
To find the word "bank" regardless of its position, we need to run a floating window along the word and return the lowest coefficient:
    double GetRangeWord(Word source, Word target) {
        double rangeWord = double.MaxValue;
        Word croppedSource = new Word();
        // The window is one letter longer than the query (but never longer than the word).
        int length = Math.Min(source.Text.Length, target.Text.Length + 1);
        for (int i = 0; i <= source.Text.Length - length; i++) {
            // Cut the window out of the database word (source), not out of the query.
            croppedSource.Text = source.Text.Substring(i, length);
            croppedSource.Codes = source.Codes.Skip(i).Take(length).ToList();
            rangeWord = Math.Min(LevenshteinDistance(croppedSource, target) + (i * 2 / 10.0), rangeWord);
        }
        return rangeWord;
    }
As you can see, after computing the Levenshtein distance we add one more value to the resulting error: (i * 2 / 10.0). The "i * 2" part is clear: it is the cost of inserting letters at the beginning of the word, as in the classic Levenshtein algorithm. But why divide by 10? In short, if we leave just "i * 2", shorter words win and the bank names drop out of the results again. So the coefficient has to be divided by 10 to reduce this bias. Why exactly 10? For our database it fits more or less, but I do not rule out that a larger divisor could be used. It all depends on the length of the words and on how similar they are. How to calculate the optimal coefficient is described a little later.
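A worked example may help here. The sketch below applies the same floating window to plain strings, using a simplified distance in which insertion and deletion cost 2 and any substitution costs 1 (the keyboard and phonetic weighting is left out). The class WindowDemo and the helper Weighted are hypothetical and exist only for this illustration.

```csharp
using System;

public static class WindowDemo
{
    // Levenshtein distance where insertion/deletion cost 2 and substitution costs 1.
    static int Weighted(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i * 2;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j * 2;
        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 2, d[i, j - 1] + 2), d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    // Slides a window of (query length + 1) letters along the word and keeps the
    // best score; every step to the right adds a small penalty of 2 / 10.
    public static double GetRangeWord(string source, string target)
    {
        double best = double.MaxValue;
        int length = Math.Min(source.Length, target.Length + 1);
        for (int i = 0; i <= source.Length - length; i++)
        {
            string window = source.Substring(i, length);
            best = Math.Min(best, Weighted(window, target) + i * 2 / 10.0);
        }
        return best;
    }
}
```

For the query "bank", the word "bankuralsib" scores 2.0 (window "banku" at offset 0, one deletion), and "gazprombank" scores about 3.2 (window "mbank" at offset 6: one deletion plus the offset penalty 6 * 2 / 10). Both stay close to each other, whereas with the undivided penalty the second word would score 14.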
With words as the unit of search sorted out, let's move on to phrases. First, a few examples:
• Warface
• World of Tanks
• World of Warplanes
• World of Warships
Let's work out what we fundamentally need from phrase search. We need it to add missing words, remove superfluous ones, and swap words around. I have said something like this before... and indeed, I remember: at the beginning of the article, about words and the Levenshtein distance function. To save your time, I will say right away that I could not use it in pure form, but, inspired by it, I was able to write code that applies to phrases.
As in the implementation of the Levenshtein distance function, if one of the phrases is empty, we return an error value equal to inserting or deleting all the letters [depending on which side the empty phrase came from]:
    if (!source.Words.Any()) {
        if (!search.Words.Any()) return 0;
        return search.Words.Sum(w => w.Text.Length) * 2 * 100;
    }
    if (!search.Words.Any()) {
        return source.Words.Sum(w => w.Text.Length) * 2 * 100;
    }
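That is, we sum the number of letters in the phrase and multiply by 2 (the weight chosen at the beginning of the article) and by 100. This 100 is the most controversial coefficient in the whole algorithm. Why it is needed will be shown more clearly below, and after that I will explain that in theory it should be calculated rather than pulled out of thin air.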
        double result = 0;
        for (int i = 0; i < search.Words.Count; i++) {
            double minRangeWord = double.MaxValue;
            int minIndex = 0;
            // Find the word of the record that is closest to the i-th query word.
            for (int j = 0; j < source.Words.Count; j++) {
                double currentRangeWord = GetRangeWord(source.Words[j], search.Words[i], translation);
                if (currentRangeWord < minRangeWord) {
                    minRangeWord = currentRangeWord;
                    minIndex = j;
                }
            }
            result += minRangeWord * 100 + (Math.Abs(i - minIndex) / 10.0);
        }
        return result;
    }
In this code we compare every word of the database record with every word of the search query and take the lowest coefficient for each word. The total coefficient for the phrase is:
    result += minRangeWord * 100 + (Math.Abs(i - minIndex) / 10.0);
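As you can see, the logic is the same as described above: the minimum coefficient of the word, minRangeWord, is multiplied by 100 and summed with a coefficient that shows how close the word is to its best position, (Math.Abs(i - minIndex) / 10.0).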
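To see how the two terms interact, here is a small self-contained sketch of the same scoring on lists of plain strings. The word-level distance is passed in as a delegate so that any function can be plugged in; PhraseDemo and the wordDistance parameter are hypothetical, and empty phrases are assumed to be handled beforehand, as in the article's code.

```csharp
using System;
using System.Collections.Generic;

public static class PhraseDemo
{
    // For every query word, finds the closest word of the record, then adds
    // 100 * (word distance) plus a small penalty for being out of position.
    public static double GetRangePhrase(IList<string> source, IList<string> search,
                                        Func<string, string, double> wordDistance)
    {
        double result = 0;
        for (int i = 0; i < search.Count; i++)
        {
            double minRangeWord = double.MaxValue;
            int minIndex = 0;
            for (int j = 0; j < source.Count; j++)
            {
                double current = wordDistance(source[j], search[i]);
                if (current < minRangeWord) { minRangeWord = current; minIndex = j; }
            }
            result += minRangeWord * 100 + Math.Abs(i - minIndex) / 10.0;
        }
        return result;
    }
}
```

Take a deliberately crude word distance, 0 for equal words and 2 otherwise. The query "tanks world" against the record "world of tanks" then scores about 0.3: both words are found exactly, and only the position penalties 0.2 and 0.1 remain. Against "world of warplanes" the same query scores about 200.1, because one unmatched word contributes 2 * 100. The factor 100 keeps word mismatches far above any position penalty.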
The multiplication by 100 compensates for the extra fraction that may have been added in the previous step while searching for the best position of the query word. It follows that this coefficient can be computed from the maximum offset between the query and all words in the database, words rather than phrases. For that, however, the modified Levenshtein distance would have to be run once "idle" over everything, which is very resource-intensive.
That is, we would run GetRangeWord, take the maximum value of the offset term "i * 2" from the best position, and then round it up to the nearest power of ten (10, 100, 1000, 10000, 100000, and so on). This gives two values.
The first is the value by which the offset is divided in GetRangeWord. The second is the value by which minRangeWord is multiplied to compensate for that offset. This way we would get an exact measure of the similarity of phrases. In practice, though, large deviations can be ignored and the average roughly estimated... which is what I actually did.
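In principle, that is all. The main questions are covered; what remains is a small refinement for "transliteration of letters". The difference between the search described above and the search by transliteration is that in the CostDistanceSymbol function we do not adjust the returned value by key distance, because the output in that case would be incorrect.
Note also that adequate results start at about 3 letters in the search string; with fewer letters it is better to use a cruder method, such as exact string matching.
Below is the most complete code for everything described above, but first:
1) This code was written by me personally in my free time.
2) The algorithm was thought up by me personally in my free time.
Separately, thanks for the links and the inspiration to Dmitry Panyushkin and Pavel Grigorenko.
The names mentioned in the article are taken from open sources and belong to their owners. This is not advertising.
Thanks to everyone who read this far. Criticism and advice are welcome.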
Full code
    public class DistanceAlferov
    {
        class Word
        {
            public string Text { get; set; }
            public List<int> Codes { get; set; } = new List<int>();
        }

        class AnalizeObject
        {
            public string Origianl { get; set; }
            public List<Word> Words { get; set; } = new List<Word>();
        }

        class LanguageSet
        {
            public AnalizeObject Rus { get; set; } = new AnalizeObject();
            public AnalizeObject Eng { get; set; } = new AnalizeObject();
        }

        List<LanguageSet> Samples { get; set; } = new List<LanguageSet>();

        public void SetData(List<Tuple<string, string>> datas)
        {
            List<KeyValuePair<char, int>> codeKeys = CodeKeysRus.Concat(CodeKeysEng).ToList();
            foreach (var data in datas)
            {
                LanguageSet languageSet = new LanguageSet();
                languageSet.Rus.Origianl = data.Item1;
                if (data.Item1.Length > 0)
                {
                    languageSet.Rus.Words = data.Item1.Split(' ').Select(w => new Word()
                    {
                        Text = w.ToLower(),
                        Codes = GetKeyCodes(codeKeys, w)
                    }).ToList();
                }
                languageSet.Eng.Origianl = data.Item2;
                if (data.Item2.Length > 0)
                {
                    languageSet.Eng.Words = data.Item2.Split(' ').Select(w => new Word()
                    {
                        Text = w.ToLower(),
                        Codes = GetKeyCodes(codeKeys, w)
                    }).ToList();
                }
                this.Samples.Add(languageSet);
            }
        }

        public List<Tuple<string, string, double, int>> Search(string targetStr)
        {
            List<KeyValuePair<char, int>> codeKeys = CodeKeysRus.Concat(CodeKeysEng).ToList();

            // The query as typed.
            AnalizeObject originalSearchObj = new AnalizeObject();
            if (targetStr.Length > 0)
            {
                originalSearchObj.Words = targetStr.Split(' ').Select(w => new Word()
                {
                    Text = w.ToLower(),
                    Codes = GetKeyCodes(codeKeys, w)
                }).ToList();
            }

            // The query transliterated from Russian to Latin letters.
            AnalizeObject translationSearchObj = new AnalizeObject();
            if (targetStr.Length > 0)
            {
                translationSearchObj.Words = targetStr.Split(' ').Select(w =>
                {
                    string translateStr = Transliterate(w.ToLower(), Translit_Ru_En);
                    return new Word() { Text = translateStr, Codes = GetKeyCodes(codeKeys, translateStr) };
                }).ToList();
            }

            var result = new List<Tuple<string, string, double, int>>();
            foreach (LanguageSet sampl in Samples)
            {
                int languageType = 1;
                double cost = GetRangePhrase(sampl.Rus, originalSearchObj, false);
                double tempCost = GetRangePhrase(sampl.Eng, originalSearchObj, false);
                if (cost > tempCost)
                {
                    cost = tempCost;
                    languageType = 3;
                }
                // ...
Vitaly Alferov, 2017