How should I configure If-Modified-Since on database-driven pages?

How should I configure If-Modified-Since on database-driven pages? - answered by Matt Cutts

Matt's answer:

Matt Cutts: Today’s question is from Tommo in London. Tommo wants to know, “Following your interview with Eric Enge, you mentioned ‘If-Modified-Since.’ We worked on many websites whereby the actual file timestamp doesn’t change but the content does, as the pages are database-driven. How should we deal with such situations?” Great question, very specific. So let’s dive in.

So for people who don’t know, suppose you have a static file. You can say “If-Modified-Since,” and the HTTP header says, essentially, how old this file is. If it hasn’t been modified since 2007, Google doesn’t have to keep fetching it all the time; we can just check whether it’s been updated, and if it hasn’t, we can just reuse the copy that’s in our index. So “If-Modified-Since” can be really, really helpful to tell search engines and bots whether the content on a specific page has changed or not.

So, Tommo’s question is: the file, the template, hasn’t changed, but we’re using a database to update a large chunk of the page, and what do you do with “If-Modified-Since” then? Well, take a step back, and say to yourself: “Okay, a search engine comes to a page. Has the page changed or not?” That’s sort of the litmus test. And not just a tiny little bit of change, but a substantial fraction of the page. In this case, since you’re using database-driven techniques to update a large fraction of the page, as far as users care, and so as far as search engines care, the page has changed.

So in that situation, I would say if you do have the ability to control the “If-Modified-Since” header, I would update that. Because remember, Google will look and it will see, “Oh, the page hasn’t been modified. Maybe we don’t have to fetch that page again. Maybe we’ll just reuse the copy that we’d crawled last time.” So whenever you’re thinking about this with database-driven sites, if you have dynamic content, content that’s changing each time or content that changes every few days, I would update the “If-Modified-Since” header if you can. And you can sort of keep it in sync with the content.

If you can’t do that, then worst case you can remove the “If-Modified-Since,” and then Google will not assume that the content is necessarily old or that it hasn’t changed. Instead it will be forced to crawl that content to find out whether it really has changed.

So, kind of an esoteric, really specific question, but my answer is: if the page truly has changed and you are updating it via database or whatever, and that sounds like what’s happening here, then I would also change the “If-Modified-Since” header so that the search engine knows to fetch that page again.

by Matt Cutts - Google's Head of Search Quality Team
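
The following sketch is not from the video; it is one possible way to keep the header "in sync with the content," as the answer suggests. On the wire, the server reports the page's age in a `Last-Modified` response header, and the crawler echoes that date back in an `If-Modified-Since` request header on its next visit. The function name `conditional_response` and the `content_updated_at` parameter are hypothetical, and the sketch assumes your application can look up when the database content shown on the page last changed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(content_updated_at, if_modified_since=None):
    """Return (status, headers) for a database-driven page.

    content_updated_at: timezone-aware datetime of the newest database
        content shown on the page (hypothetical; for example the latest
        "updated at" value among the rows the page displays).
    if_modified_since: raw If-Modified-Since request header, or None.
    """
    # HTTP dates have one-second resolution, so drop microseconds.
    last_modified = content_updated_at.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: ignore it and send the full page
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
        if since is not None and last_modified <= since:
            return 304, headers  # content unchanged: no body needed

    return 200, headers  # content is newer (or no usable header): send the page
```

With this approach the date comes from the content rather than from the template file's timestamp, so a page whose database rows changed yesterday reports yesterday's date even if the file on disk dates from 2007. If the application cannot determine when the content changed, the fallback described in the answer is to omit the header entirely.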

