SteveD:

Gary Nicholass

Google have archived pages that, for various reasons valid to myself and
others, I have removed from websites. That is my decision, taken according
to my morality, the opinions of others, or due to changing circumstances.
More to the point I own the domain names and all the content therein. What
right have Google to cache and serve pages which I have deleted?

I accept that anything I say on a ng will be recorded, and act accordingly.
I absolutely do not agree with Google caching and providing from their db
something that I have written and subsequently deleted, for whatever reason,
from the sites that I maintain.

Can’t see the issue.

I can.

My posts on usenet, and my entries and articles on
Aquarionics.com come under an implied (but soon – in the case of the
website – explicit) licence to be quoted or reproduced wholesale,
provided both context and attribution are provided and no commercial
worth is given to the post in particular. I retain copyright on the
items, yet grant the world at large permission to quote and use my
works, on those provisos. This is the implied statement you give
whenever you post anything to usenet. (The stuff about “Commercial
Worth” allows people to sell NNTP services without ownership getting
in the way, but stops someone collecting all my posts – for
example – and selling them as a book whilst gathering royalties. That I
could get moderately intense about.) (Note: The contents of this post
do not imply a legal statement; it wasn’t designed as such and
shouldn’t be used as such. If you want a watertight legal document,
ask a lawyer.)

I don’t really have a problem with Google caching all that,
because people are free to quote it. I’d prefer them to read the original
source, because it will have any corrections or removals I may have
made, but I assume that if they are looking at Google’s cache, this
isn’t possible.

On the other hand, I write short stories (and currently a novel,
but that’s another point) which it would be nice to get published some day.
A couple have already been so (Well, Fan/E-Zine published, which doesn’t
pay as well, but is still very nice), but one day it might be legally
necessary for me to take them down from the site, and at that point I
run into problems with the Google Cache, because it’s now costing
someone money, in that people who would buy the thing to read the
stories are instead reading them on Google.

The two sides of this could be thought of as follows.
Neal Stephenson (of Snow Crash, Zodiac and Cryptonomicon) wrote a novel
called “The Big U” (I believe it was his first, but could be wrong) which
hadn’t been reprinted in years, and showed no signs of ever being so.

This being so, Neal didn’t object too much when copies of it
started appearing on the net. Then the success of Cryptonomicon (which
has a sequel, ‘Quicksilver’, out this summer) brought The Big U back into
print, so sites were asked to take it down.

On the other hand, Cory Doctorow published his first novel,
‘Down and Out in the Magic Kingdom’, simultaneously online (via his
website, http://www.craphound.com/down/) and via Tor Books (Get it,
by the way, it’s very good). This he did because first-time authors don’t
tend to get a lot of publicity, and this way many people would hear of
and read the book (and hopefully enjoy it) because:

  1. It’s free from the website
  2. People might see discussions about this odd publishing method
  3. A quarter of a million people a month read his weblog at
    http://www.boingboing.net, and the novel has been widely
    and positively received. That’s an awful lot of word of mouth.

SteveD:

You put information into the public domain [1]. On purpose. Someone made
a copy of that information and will show it to people who ask. If you
don’t want the information to go public, don’t put it in the public
domain in the first place. Google is not the only webcrawler in the
world, nor the only archiver. I’d suggest researching the actions of the
Wayback Machine and thinking about how many other people do this kind of
thing as a hobby.

Beware the terminology. “In the public domain” and “readable by the
public” are different things. You can see Mickey Mouse and watch his films,
but try to use him without working for Disney and you’ll find yourself
in la-lawyer land.

SteveD:

If you merely don’t want the reputable archivers to record your
information, I suggest you look into the use of a file called
ROBOTS.TXT.

Personally, I would like Google to have a separate useragent
string for the archiving as opposed to the crawling, so I can tell it
to index the site for searching, but exclude certain bits from
caching. When I get back online properly, I’ll email them about this.
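
For the curious, here is a rough sketch of how that would look if such a useragent existed, using the robots.txt parser that ships with Python (urllib.robotparser). Everything named here is made up for illustration: “ExampleArchiver” and “ExampleIndexer” are hypothetical useragents, not real Google ones, and /stories/ and example.com stand in for whatever you’d actually want kept out of a cache.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: "ExampleArchiver" and /stories/ are invented for
# this sketch. Everyone else is allowed everywhere (an empty Disallow).
ROBOTS_TXT = """\
User-agent: ExampleArchiver
Disallow: /stories/

User-agent: *
Disallow:
"""

def allowed(useragent, url, rules=ROBOTS_TXT):
    """Return True if a well-behaved robot with this useragent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(useragent, url)

story = "http://www.example.com/stories/one.html"
index_ok = allowed("ExampleIndexer", story)     # falls through to the "*" rules
archive_ok = allowed("ExampleArchiver", story)  # matches the archiver rules
```

With those rules the indexer may read the story and the archiver may not, which is the split I’m after. It only works on robots that choose to obey the file, of course.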

To sum up my entire post, I make this point: Just because it’s
on a website doesn’t mean it isn’t copyrighted. If it’s copyrighted, you
should get permission before you use it.