Bigram

A bigram is a group of data made up of 2 units, for example 2 letters, 2 syllables, or 2 words. Bigrams are a simple and useful tool for analysing text. They also work well as a language model for speech recognition (Collins, 1996). A bigram is a special case of an N-gram, with N = 2.
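As a minimal sketch of the definition above, the following Python pairs each unit with the one that follows it. The function name `bigrams` and the sample strings are illustrative choices, not part of any particular library.

```python
def bigrams(units):
    """Return the adjacent pairs of a sequence of units, in order."""
    units = list(units)
    # zip stops at the shorter input, so the last unit has no partner
    return list(zip(units, units[1:]))


# Word bigrams: the units are whitespace-separated words
word_pairs = bigrams("the quick brown fox".split())

# Letter bigrams: the units are the characters of a string
letter_pairs = bigrams("text")

print(word_pairs)
print(letter_pairs)
```

The same function serves letters, syllables, or words, because only the choice of unit changes.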

Types

A gappy bigram (also called a skipping bigram) is a pair of units with a gap between them, for example a pair formed by skipping over a connecting word, or a pair used in a dependency grammar to model a dependency relation.
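The idea can be sketched in Python as follows. The function name `gappy_bigrams` and the `max_gap` parameter are hypothetical, and the choice to exclude adjacent pairs (so that every pair really has a gap) is an assumption of this sketch; some definitions of skip-bigrams also include adjacent pairs.

```python
def gappy_bigrams(units, max_gap=1):
    """Return pairs of units separated by 1 to max_gap skipped units."""
    units = list(units)
    pairs = []
    for i, first in enumerate(units):
        # j = i + 2 skips exactly one unit; j = i + 1 + max_gap skips max_gap
        for j in range(i + 2, min(i + 2 + max_gap, len(units))):
            pairs.append((first, units[j]))
    return pairs


# Skipping over the connecting word "and"
print(gappy_bigrams(["salt", "and", "pepper"]))
```

Raising `max_gap` lets the pair span longer stretches of text, at the cost of producing many more pairs.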

A head word bigram is a kind of gappy bigram in which the two units stand in an explicit dependency relation.

Uses

In cryptography, a bigram frequency attack uses frequency analysis of bigrams to break a cryptogram.
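The counting step of such an analysis might look like the sketch below. The function name `letter_bigram_counts` and the sample ciphertext are made up for illustration. The sketch only tallies letter pairs; comparing the tallies with the bigram frequencies of the suspected plaintext language, which is what an actual attack relies on, is not shown.

```python
from collections import Counter


def letter_bigram_counts(text):
    """Count adjacent letter pairs, ignoring case and non-letters."""
    # Dropping spaces and punctuation means pairs may span word boundaries
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


# Hypothetical ciphertext, for illustration only
counts = letter_bigram_counts("XQ ZXQ ZXQB")
print(counts.most_common(3))
```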

Theory

If the probability of a bigram and the probability of its first unit are both known, the definition of conditional probability (the identity behind Bayes' theorem) gives the conditional probability of the second unit:

 P(T_n \mid T_{n-1}) = \frac{P(T_{n-1}, T_n)}{P(T_{n-1})}

In other words, given the preceding unit T_{n-1}, the probability of T_n, written P(T_n \mid T_{n-1}), is the bigram probability P(T_{n-1}, T_n) divided by the probability of the preceding unit, P(T_{n-1}).
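A minimal sketch of this calculation in Python, assuming that both probabilities are estimated as relative frequencies in a sample of text (an assumption of this example, not something the formula itself requires). Under that assumption the shared denominator cancels and the ratio reduces to count(T_{n-1}, T_n) / count(T_{n-1}). The function name `bigram_conditional_probs` is illustrative.

```python
from collections import Counter


def bigram_conditional_probs(tokens):
    """Estimate P(cur | prev) for every bigram (prev, cur) seen in tokens."""
    tokens = list(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Only units that start a bigram count as "previous" units,
    # so the final token is left out of the denominator
    prev_counts = Counter(tokens[:-1])
    return {
        (prev, cur): count / prev_counts[prev]
        for (prev, cur), count in pair_counts.items()
    }


probs = bigram_conditional_probs("a b a c a b".split())
print(probs[("a", "b")])  # "a" starts 3 bigrams, 2 of which are ("a", "b")
```

For each preceding unit, the estimated probabilities of all its observed successors sum to 1.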


Related articles