Google's search programs, called "spiders" or "robots", scan netblock after netblock for webservers and then enumerate whatever they can on them, usually striking a balance between a "gentle" crawl configuration and brute-force enumeration of the entire server.
These programs usually operate in a mostly predictable fashion, and even if they're rude and pushy, the responsibility for securing this information from unwanted cataloguing lies SOLELY with the website owner. Google is only doing what it does best; DB, on the other hand, has the sole power to lock down the information.
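For what it's worth, the polite half of that balance is entirely under the site owner's control: a well-behaved spider checks robots.txt before fetching anything. Here's a minimal Python sketch of that check, assuming a hypothetical /reports/ directory is the content DB would want kept out of the index (rude crawlers ignore this file, so anything genuinely sensitive still needs server-side access controls on top of it):

    from urllib import robotparser

    # Hypothetical robots.txt a site owner might publish to keep /reports/
    # out of well-behaved crawlers' indexes. Rude crawlers simply ignore it.
    ROBOTS_TXT = """\
    User-agent: *
    Disallow: /reports/
    """

    parser = robotparser.RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())

    # A polite spider asks before fetching; this is the entire "gentle" side
    # of the crawl described above.
    for url in ("http://example.com/index.html", "http://example.com/reports/q3.pdf"):
        print(url, "->", "fetch" if parser.can_fetch("Googlebot", url) else "skip")

That covers the well-behaved spiders; the rude ones are exactly why the real fix is authentication or an access rule at the server, not a politeness file.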
I've offered solutions; they prefer a more open access model, so I presume it will continue as is.