It’s a good question.
A page could be missing and then come back later. Technically, if you really want to signal that a page is completely gone and will never come back, there’s an HTTP status code for that: 410. But at least the last time we checked, back in 2007, we actually treated 404s and 410s the same way.
But to get to the meat of your question, why does it take so long, the answer is that webmasters can do kind of interesting things. And we sometimes see webmasters shoot themselves in the foot. Like, they’ll completely remove their site from the search results. Or they’ll be down and returning 404s instead of something like a 503 that says come back later.
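To make the difference between those status codes concrete, here is a minimal sketch of how a site might decide what to return. The function name `choose_status`, its parameters, and the one-hour `Retry-After` value are assumptions made up for illustration; nothing here comes from Google or from a particular web server.

```python
from http import HTTPStatus


def choose_status(site_down: bool, page_exists: bool, gone_forever: bool = False):
    """Return (status, extra_headers) for a request.

    Hypothetical helper: the names and the Retry-After value are
    illustrative assumptions, not taken from any real server.
    """
    if site_down:
        # Temporary outage or maintenance: say "come back later"
        # rather than claiming that the page does not exist.
        return HTTPStatus.SERVICE_UNAVAILABLE, {"Retry-After": "3600"}
    if gone_forever:
        # The page was removed on purpose and will not return.
        return HTTPStatus.GONE, {}
    if not page_exists:
        # The page is missing, but it might come back.
        return HTTPStatus.NOT_FOUND, {}
    return HTTPStatus.OK, {}
```

The point of the sketch is only the ordering: an outage is checked first, so a site that is down never answers with a 404 for pages that still exist.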
And so, rather than learn very quickly that this is a 404 and make it drop out forever, usually you prefer to build in a little bit more leeway there, so that if a webmaster is making a mistake, you can check a few times and make sure that it really is gone before you drop it out of the index. Now, it’s always tricky, because if you get it wrong one way, people are unhappy. If you get it wrong the other way, people are unhappy.
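As a rough illustration of that kind of leeway, the sketch below only drops a URL after several misses in a row. This is not Google's crawler logic: the threshold `DROP_AFTER_MISSES`, the `UrlRecord` class, and the `record_fetch` function are all hypothetical, and the choice to count 404 and 410 identically simply mirrors the "treated the same way" remark above.

```python
from dataclasses import dataclass

# Hypothetical threshold: how many "missing" responses in a row
# are needed before a URL is dropped from the index.
DROP_AFTER_MISSES = 3


@dataclass
class UrlRecord:
    url: str
    consecutive_misses: int = 0
    indexed: bool = True


def record_fetch(record: UrlRecord, status: int) -> UrlRecord:
    """Update a URL's state after one crawl attempt (illustrative only)."""
    if status in (404, 410):
        # Count the miss, but only drop the URL after repeated misses.
        record.consecutive_misses += 1
        if record.consecutive_misses >= DROP_AFTER_MISSES:
            record.indexed = False
    elif status == 200:
        # The page is back: reset the counter and keep it indexed.
        record.consecutive_misses = 0
        record.indexed = True
    # Any other status (for example a 503) leaves the state unchanged,
    # so a temporary outage does not count against the page.
    return record
```

With this policy a single 404 during a glitch changes nothing visible, while a page that keeps returning 404 across three checks is eventually dropped.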
So based on the complaints that we hear, what people are happy about and what they’re sad about, we try to settle on something like: maybe we’ll try this page a few more times and make sure that it’s really gone. Otherwise, you would hate it if you had a temporary glitch with your web server, and then Google didn’t come back and check on that web server for like three years or something like that. So it is the sort of thing where we try to find a good balance there. Thanks for the feedback, though. I can always talk to the crawl team and find out: do the 410s really make things go away faster now? Or are they still treated the same?
But at least for the time being, we try to build in that safety margin so that if webmasters do make a mistake, if their server’s overloaded, if their web host configured something incorrectly, it won’t sabotage their site, it won’t cause long-term damage, and there will be a way to recover.
by Matt Cutts - Google's Head of Search Quality Team