Google, user interests, and biasing factors
by Michael L. Love, Ph.D
Sat Feb 27 13:45:22 EST 2010
Is the ability of a site to drive traffic included in Google Page Rank, and
does it predominate over other factors, such as number of external links? If
so, is the factor of ability to drive traffic well correlated with the number
of external links? Ultimately, how does this correlate with users' interests,
which is Google's stated aim? What factors are biasing Page Rank? What are the
other factors that Google is using now instead of Page Rank? The answer
to these questions resides in part in a simple logical exercise.
If page rank is determined or largely
influenced by the ability to drive traffic, it is the essence of
corporate bias, because the ability to drive traffic is immaterial to
the aims of the search engine end user, and it can only serve the aims
of advertisers and large corporations, like Google, guilty on both
counts. In this perspective, such bias seems inevitable in a scheme
that is economically driven. Google is known to have transcendent aims,
so that we should not assume this to be the case, but it is a compelling
case, and possibly a good model and fortunate for them as well.
The logical analysis will be followed by conclusions based in part on
that analysis, as well as on a small sampling of Page Rank data, and on a short
analysis of the ability of gnu-darwin.org and related sites to drive
traffic to other sites. None of the data are presented, and they are
left as an exercise for the reader.
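A reader taking up that exercise might start from a sketch like the one
below. It computes a plain Pearson correlation between two lists, for
example external link counts and referral visits for the same set of sites.
The function name, the variable names, and the numbers are hypothetical
placeholders chosen for illustration; they are not the data referred to
above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder numbers, NOT measurements from this article.
external_links = [12, 40, 55, 130, 300]    # inbound links per site
referral_visits = [30, 25, 90, 110, 400]   # visits each site sent onward

r = pearson(external_links, referral_visits)
print(f"r = {r:.2f}")
```

A value of r near 1 would suggest that the two factors move together, and
a value near 0 that they are largely independent. With only a handful of
sites, either result should be read with caution.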
case  correlated  biased
1     yes         yes
2     no          yes
3     yes         no
4     no          no
in the binary:
case 1, correlated and biased
This appears to be invalid in the binary, although it is consistent with
our presuppositions in the non-binary sense, where these are continuous
variables. That is to say, it is consistent with economic bias coupled with
a correlation with user interest. In the binary sense these are mutually
exclusive.
case 2, not correlated and biased
Valid options are often the most interesting.
Case 2 is interesting because it implies that Google
discovered that the ability to drive traffic was a better metric than
external linkage. Because they aim to be the best search engine for the
end user, one can only assume that there is a plausible argument that
the ability to drive traffic is an important metric of user interest.
In case 2 it turns out to be far more important and accurate than external
linkage. This makes a kind of sense. As we discovered with the molecules
site, the interests of webmasters do not necessarily coincide well with
the interests of search engine users. If there is a better metric than
external linkage, then this would be fortunate for users. The underbelly
question is whether the ability to drive traffic is a better metric
than other factors. Corporate bias may indicate the use of the ability
to drive traffic anyway. The users take a cut in the interest of corporate
profit, something that is known to happen from time to time ;-}.
case 3, correlated and not biased
This would indicate that the ability to drive traffic merely happens to be
consistent with the page rank in the observed cases. This may seem unlikely,
but it is not impossible.
case 4, not correlated and not biased
This would indicate that Google believes that it has found better metrics
than either of these, an interesting view which leads directly into further
Google's primary competitive advantage appears to me to be search-related
advertising. It is advantageous in both the monetary and informational
senses, because the ads are pushed directly to users with an apparent interest
in the topic. This has been noted to be useful in the informational sense
to users, who tend to receive ads that may be interesting to them and well
targeted. What is less often observed is the informational advantage that
this gives to Google, which I am arguing is the more important proprietary
corporate advantage. Google sees what links and ads are clicked, so that they
have a direct metric of end-user interests. They only need a sampling, and this
is what they use. Case 4 is plausible, because they may have discovered a
better metric than either external links or the ability to drive traffic.
If so, then they are apparently keeping it to themselves, which is no
surprise. It would be a tremendous advantage for them.
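The remark that only a sampling of clicks is needed can be illustrated with
a short simulation. The sketch below estimates a click rate from a random
sample and reports its standard error. The population, the 0.30 rate, and
the sample size are assumptions made up for the example. Nothing here
describes how Google actually samples or uses click data.

```python
import random
from math import sqrt


def estimate_rate(clicks):
    """Return (proportion, standard error) for a list of 0/1 click outcomes."""
    n = len(clicks)
    if n == 0:
        raise ValueError("empty sample")
    p = sum(clicks) / n
    # Standard error of a sample proportion.
    se = sqrt(p * (1 - p) / n)
    return p, se


# Simulated population with an assumed (hypothetical) click rate of 0.30.
rng = random.Random(0)
population = [1 if rng.random() < 0.30 else 0 for _ in range(100_000)]

# Look at only 2,000 of the 100,000 simulated impressions.
sample = rng.sample(population, 2_000)
p_hat, se = estimate_rate(sample)
print(f"estimated rate {p_hat:.3f} +/- {1.96 * se:.3f}")
```

The point of the sketch is that the uncertainty shrinks with the square
root of the sample size, so a modest sample already pins down the rate
fairly well.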
Google has had sufficient time to make user click
data the primary metric, but have they done so? It is arguable that the
ability to drive traffic is even more advantageous than user click data.
Advertisers would like to influence users to extend their interests and
act outside of the normal regimens in order to gain customers, the primary
aim of advertising. The secondary aim of advertising is to increase the
value of the brand, so that users' interests in the brand are reinforced,
and they are channeled to it. How does user click data advance these aims?
The answer is that these aims are no metric of user interest, but of the
success of the advertising campaign. The ability to drive traffic would
be a better metric in this calculation, which explains the success
of advertisers that act outside of Google's channel. They only have to
identify the sites that drive traffic, and they can advertise on those
sites, eliminating Google as the middleman. It appears that Amazon's Alexa
has provided implicit traffic driving factors to the public. If page rank
is a metric of the ability to drive traffic, then it is advantageous to
Google's competitors, in addition to Alexa. It is arguable that Google can
model the ability to drive traffic to near perfection, because of their
access to user click data from search and ads combined. One wonders if
Alexa can match that ability.
Would Google give a unique advantage to their competitors?
I think that the answer to that question is no, not only because of Google's
competitive corporate interests, but also because of their transcendent aims.
Before discussing the impact of the transcendent aims, it should be noted
that it is contrary to the economic model and highly unlikely that a
corporation would yield a unique proprietary advantage to its competitors.
Google maintains its user behaviour model for internal use, but also to
improve the search experience. This is arguably related to their transcendent
aims. Being the best at something is not always consistent with economic
gain, at least in the short term. It is a truism that the best does not
always win. Regardless of a transcendent aim,
we can simply argue that Google aims to be the best search engine for
some unknown reason, or because of the intelligent view that being the best
will win eventually. To ensure this, they obsess on their corporate user
experience model, by which they hope to accelerate their path to success.
This is the main reason why Google is so far ahead of its competitors. They
will create an excellent search experience for users, even sometimes when
it appears to be an economic disadvantage to do so. In this way, they
increase and maintain their market-share better than the others.
The conclusion appears to be that Google now ranks search results according
to raw user interest, which they have measured directly, and which they can
now model to near perfection in the statistical sense. This is consistent with
the fact that they are now trying to model individual interest, with varying
degrees of success. Page Rank then is an early model which reflects
user interest, but it possibly includes other factors as well. In fact, one
concludes in retrospect that the aim in designing Page Rank was solely to model
user interests in order to improve the user experience. This was in fact their
stated aim. Page Rank is less useful to Google now that they have direct data
on which to base a more accurate model. It appears to me that Page Rank may be
an important metric to users, who may like an indicator of the popularity
of a website, but perhaps less important for advertising competitors. These
are likely the reasons why Google made Page Rank public, and it is evidence
that they have rendered it obsolete. Google keeps the factor of ability to drive
traffic largely to themselves, as a proprietary advantage in their advertising
business, which is apparently reflected in the differing nuances between the
search results and ads. Obviously this is not the only factor biasing ad
Google is facing ever-evolving competition, most recently in light of the
Yahoo/Microsoft Bing partnership, which creates a vast economically competent
competitive force. Moreover, specialty search engines
are constantly improving their appeal in niche and vertical markets, as well
as other segments of the economy. Such innovations certainly appeal to
Google's advertising competitors, who are dedicated solely to driving
traffic to their client sites. Of particular note are the social networking
sites, such as
MySpace and LinkedIn, but especially Facebook, which has a highly evolved
engine for social network searches. In light of this competition, one wonders
if Google should better leverage their main strengths, and so avoid the temptation to
involve corporate bias in the presentation of ads. This strategy would be
expected to improve advertising success as well by focusing on what users
want to see, instead of an errant focus on what drives traffic.