

Construct a feature vector for each word or term w in a corpus, including a count for each feature f in the feature vector

↓

Determine the value of each feature f in the feature vector as the pointwise mutual information (MI) between the word w and the feature f

↓

Determine a similarity value between two words or phrases w1 and w2 as the cosine of the angle between their feature vectors, using the values of the features in the feature vectors
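
The three steps above can be sketched in Python. This is only an illustration under stated assumptions, not the patent's implementation: the figure does not say what a feature is, so the sketch assumes a feature is a neighboring word within a small window, and it assumes the common definition PMI(w, f) = log(c(w, f) · N / (c(w) · c(f))). The function names `count_features`, `pmi_vectors`, and `cosine` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def count_features(sentences, window=1):
    """Step 1: count (word, feature) co-occurrences.

    Assumption: a feature of a word is any other word within `window`
    positions of it. The source does not define the feature set.
    """
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, w in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[w][tokens[j]] += 1
    return counts


def pmi_vectors(counts):
    """Step 2: replace each raw count with pointwise mutual information."""
    total = sum(sum(c.values()) for c in counts.values())
    word_totals = {w: sum(c.values()) for w, c in counts.items()}
    feat_totals = Counter()
    for c in counts.values():
        feat_totals.update(c)
    return {
        w: {
            f: math.log(n * total / (word_totals[w] * feat_totals[f]))
            for f, n in c.items()
        }
        for w, c in counts.items()
    }


def cosine(u, v):
    """Step 3: cosine of the angle between two sparse feature vectors."""
    dot = sum(x * v[f] for f, x in u.items() if f in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus: "red" and "blue" share the same contexts.
corpus = [["red", "car"], ["blue", "car"], ["red", "bike"], ["blue", "bike"]]
vectors = pmi_vectors(count_features(corpus))
similarity = cosine(vectors["red"], vectors["blue"])
```

In this toy corpus "red" and "blue" occur with exactly the same neighbors, so their vectors point in the same direction, while "red" and "car" share no features at all. Sparse dictionaries are used so that only observed features are stored.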


Segment queries in query logs into one or more query word sequences that maximize the overall probability for the query

↓

Determine frequent n-grams (n-word sequences) and count the query word sequences in which all adjacent pairs of words in the sequence are frequent n-grams

↓

Filter out non-compound or non-phrasal word sequences by requiring a compound/phrase to appear at both the beginning and the end of some queries (but not necessarily in the same query)

↓

Construct a feature vector for each n-gram in a corpus, including a count for each feature f in the feature vector

↓

Determine the value of each feature f in the feature vector as the pointwise mutual information (MI) between the n-gram and the feature f

↓

Determine a similarity value between two n-grams as the cosine of the angle between their feature vectors, using the values of the features in the feature vectors

↓

Generate an extraction/contraction table of pairs of compounds in which one compound is a substring of the other, with respective counts (e.g., from query logs) or similarity values

Reference numerals: 70, 71, 72, 73, 74, 75, 76, 77

FIG. 8
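
The first three steps of the flow (segmentation, frequent n-gram counting, and the begin/end filter) can be sketched as follows. This is a minimal sketch under assumptions that the figure does not confirm: the probability of a query is taken to be the product of independent segment probabilities, those probabilities come from a hypothetical lookup table `seg_prob`, unseen single words receive a small floor probability, "frequent n-grams" are represented as a set of frequent bigrams, and the begin/end filter only considers queries with at least two segments. All function names are hypothetical.

```python
import math
from collections import Counter


def segment(query_words, seg_prob, max_len=3, floor=1e-6):
    """Split a query into the word sequences with the highest total probability.

    Dynamic programming over end positions; best[i] holds the best
    (log-probability, segmentation) for the first i words.
    """
    n = len(query_words)
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            seg = tuple(query_words[start:end])
            p = seg_prob.get(seg)
            if p is None:
                if len(seg) > 1:
                    continue  # unseen multiword segments are not allowed
                p = floor  # assumption: unseen single words get a floor value
            score = best[start][0] + math.log(p)
            if score > best[end][0]:
                best[end] = (score, best[start][1] + [seg])
    return best[n][1]


def count_frequent_sequences(segmented_queries, frequent_bigrams):
    """Count multiword sequences whose adjacent word pairs are all frequent."""
    counts = Counter()
    for query in segmented_queries:
        for seg in query:
            pairs = list(zip(seg, seg[1:]))
            if pairs and all(pair in frequent_bigrams for pair in pairs):
                counts[seg] += 1
    return counts


def boundary_filter(segmented_queries, candidates):
    """Keep candidates seen at the start of some query and the end of some query."""
    multi = [q for q in segmented_queries if len(q) > 1]
    starts = {q[0] for q in multi}
    ends = {q[-1] for q in multi}
    return {c for c in candidates if c in starts and c in ends}


# Hypothetical segment probabilities and query log.
seg_prob = {("new", "york"): 0.01, ("new",): 0.02, ("york",): 0.001,
            ("hotels",): 0.05, ("flights",): 0.03, ("cheap",): 0.04}
log = [["new", "york", "hotels"], ["flights", "new", "york"], ["cheap", "hotels"]]
segmented = [segment(q, seg_prob) for q in log]
counts = count_frequent_sequences(segmented, {("new", "york")})
kept = boundary_filter(segmented, set(counts))
```

Here "new york" survives the filter because it begins one query and ends another, which is the kind of evidence the third step asks for. The remaining steps of the flow reuse the feature-vector and cosine computation, applied to n-grams instead of single words.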

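
The final step of FIG. 8, generating the extraction/contraction table, could look roughly like the sketch below. It assumes that compounds are stored as word tuples, that "substring" means a contiguous run of words inside a longer compound, and that each compound has a count taken from some source such as a query log. The names `is_contiguous_sub` and `contraction_table` and the sample counts are hypothetical.

```python
def is_contiguous_sub(short, long):
    """True if `short` occurs as a contiguous, strictly shorter run of words in `long`."""
    k = len(short)
    if k == 0 or k >= len(long):
        return False
    return any(long[i:i + k] == short for i in range(len(long) - k + 1))


def contraction_table(compound_counts):
    """Pair each compound with every shorter compound it contains.

    Returns rows of (longer, shorter, longer_count, shorter_count), sorted
    for stable output. The pairwise scan is quadratic in the number of
    compounds, which is acceptable for a sketch but not for a full query log.
    """
    rows = []
    for long, long_count in compound_counts.items():
        for short, short_count in compound_counts.items():
            if is_contiguous_sub(short, long):
                rows.append((long, short, long_count, short_count))
    return sorted(rows)


# Hypothetical compound counts.
compound_counts = {
    ("new", "york", "city"): 40,
    ("new", "york"): 100,
    ("york", "city"): 5,
    ("city", "hall"): 7,
}
table = contraction_table(compound_counts)
```

Each row can be read in both directions: as a contraction from the longer compound to the shorter one, or as an extraction of the shorter compound from the longer one. A similarity value could be stored in place of the counts, as the step allows.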