Semantic analysis of keywords and keyphrases
The question this section sets out to address, and for which it tries one
possible solution, is the following: can a machine "comprehend" a (key)phrase?
Definition of the problem:
Let N be a set of n phrases { Fi }. We want to obtain a set of pairs
{ (Fi, Ci) }, where Ci is the "category" (meaning) to which the phrase Fi belongs.
The categories are not known a priori but must be determined
by a statistical analysis of the phrases supplied as input.
Applications:
One possible application is to obtain "intelligent and dynamic" statistics
on the keyphrases (and keywords) that visitors type into search engines
in order to reach our web sites.
Many specialized web sites offer access-statistics services for a
URL. They provide detailed reports on the phrases (referrers) that
visitors used in search engines in order to reach the web site, but they
often report only the keyphrase and the number of times (hits) it was
used to arrive at the URL.
It would be much more interesting to get a report that dynamically
groups all the phrases into a few categories (and even subcategories),
with totals obtained by adding up the hits. Such a report would
be fundamental for a webmaster who wishes to understand which sections
of the web site are visited most. In this way the web site manager can
make the best optimization and web marketing choices.
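As an illustration of the report described above, the following Python sketch sums the hits of each keyphrase into the total of its category. It assumes that the pairs (phrase, category) have already been computed. The function name hits_by_category and the "uncategorized" fallback label are hypothetical, introduced only for this example.

```python
from collections import defaultdict


def hits_by_category(hits, category_of):
    """Sum the hits of every keyphrase into the total of its category.

    hits:        mapping keyphrase -> number of hits
    category_of: mapping keyphrase -> category (assumed already computed)
    """
    totals = defaultdict(int)
    for phrase, count in hits.items():
        # Phrases without a known category fall into a catch-all bucket
        # (an assumption of this sketch, not part of the original text).
        totals[category_of.get(phrase, "uncategorized")] += count
    return dict(totals)
```

Given made-up hit counts for a handful of keyphrases, the function returns one total per category, which is the kind of grouped figure the report would show in place of a flat list of phrases.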
Analysis of the (Italian-language) phrases used in search engines:
Fortunately, the phrases typed into search forms share common
characteristics that can facilitate their logical analysis. The main ones are:
A possible algorithm for grouping keyphrases consists of two phases. The first (the learning phase) builds a database of related keyphrases. The second detects the related meanings by means of statistical operations:
Phase 1: "Learning":
for each phrase in period
    for each word in phrase
        if is_not_common(word) then
            if Not Objects.Exists(word) then
                Objects.Add(word, related, 1)
            else
                Objects(word).k++
                Merge(word, related)
            end if
        end if
    next
next
At the end of the cycle we will have a set of records of the type:
{ (String Object, String() Related, Int K) }
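The learning loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the author's implementation. The pseudocode does not define related or is_not_common, so here related is assumed to be the other non-common words of the same phrase, and COMMON_WORDS is a small hypothetical stop-word list. The names learn and ObjectEntry are illustrative.

```python
from dataclasses import dataclass, field

# Hypothetical stop-word list; a real one would be much larger.
COMMON_WORDS = {"di", "a", "da", "in", "con", "su", "per", "il", "la", "e"}


@dataclass
class ObjectEntry:
    """One record of the form (Object, Related, K)."""
    word: str
    related: set = field(default_factory=set)
    k: int = 0


def is_not_common(word):
    return word not in COMMON_WORDS


def learn(phrases):
    """Phase 1: build the table of objects from an iterable of phrases."""
    objects = {}
    for phrase in phrases:
        words = [w for w in phrase.lower().split() if is_not_common(w)]
        for word in words:
            # Assumption: "related" = the other non-common words of the phrase.
            related = {w for w in words if w != word}
            # A new word starts at k = 0 and is incremented to 1 below,
            # matching Objects.Add(word, related, 1) in the pseudocode.
            entry = objects.setdefault(word, ObjectEntry(word))
            entry.k += 1
            entry.related |= related
    return objects
```

Calling learn on a list of phrases returns a dictionary keyed by word, where each entry carries the occurrence count k and the merged set of related words.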
Phase 2: "Statistical analysis":
Compute the mean (m) and the variance (s) of the K values. The "categories"
are defined as all the objects for which
k >= m + s
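The selection rule can be sketched in Python as follows, assuming the K values are available as a mapping from word to count. The function name select_categories and the spread parameter are hypothetical. By default the sketch follows the text literally and uses the variance for s. The "stdev" option is an assumption added here, because s often denotes the standard deviation and the two choices can select different categories.

```python
from statistics import mean, pvariance, pstdev


def select_categories(counts, spread="variance"):
    """Return the words whose count k satisfies k >= m + s."""
    ks = list(counts.values())
    if not ks:
        return []
    m = mean(ks)
    # The text calls s the variance; "stdev" is an alternative offered here
    # as an assumption, since s often denotes the standard deviation.
    s = pvariance(ks) if spread == "variance" else pstdev(ks)
    return sorted(word for word, k in counts.items() if k >= m + s)
```

With counts such as {"roma": 3, "hotel": 2, "economici": 1, "voli": 1}, the mean is 1.75 and the population variance is 0.6875, so the threshold is 2.4375 and only "roma" qualifies as a category.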
Powered by Ing. Paolo Cavone