Identify and Fix 404 Errors
Having external websites link to pages that don’t exist is one thing; serving your own users and web crawlers broken URLs is another. For visitors, hitting a 404 error after clicking a link is like ordering a menu item at a restaurant and having the server tell you that they are all out of it. Annoying, right?
Search engines hate 404 errors as much as visitors do. In many cases, as author Bill Hunt mentioned at SMX in June 2014, search engines will actually stop crawling XML sitemaps: “Bing has stated that if more than 1% of pages submitted have errors, they will stop crawling the URLs in the XML sitemap.”
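To put that 1% figure in perspective, here is a minimal Python sketch of the arithmetic. The function names and the sample counts are ours, purely for illustration; the only number taken from the quote is the 1% threshold.

```python
def sitemap_error_rate(total_urls: int, error_urls: int) -> float:
    """Return the fraction of submitted sitemap URLs that return errors."""
    if total_urls <= 0:
        raise ValueError("total_urls must be positive")
    return error_urls / total_urls


def exceeds_threshold(total_urls: int, error_urls: int,
                      threshold: float = 0.01) -> bool:
    """True when the error rate is above the threshold.

    The 1% default comes from the quote above; adjust it if a search
    engine documents a different figure.
    """
    return sitemap_error_rate(total_urls, error_urls) > threshold


# Hypothetical numbers: a 5,000-URL sitemap with 60 broken entries
# has an error rate of 1.2%, which is over the 1% line.
print(exceeds_threshold(5000, 60))
```

In other words, on a 5,000-page store it only takes a few dozen discontinued products to cross the line.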
How to Identify 404 Errors
Google Webmaster Tools is a great place to start when looking for 404 errors both on and off your website. Simply navigate to Google Webmaster Tools, select your website and click Crawl Errors in the left navigation.
If you’re really on your game, you’ll set up a browser shortcut and schedule weekly reviews of all errors in Webmaster Tools.
Other Broken Link Checkers
We wouldn’t normally use the term broken link checker, but apparently you folks search with that phrase, so we’re using it here, in addition to semantic variations such as broken link finder, broken link scanner and possibly even the best broken link checker. Sorry, had to have some fun since this website is all about SEO.
We prefer the following paid tools, however you search for them, in order of preference:
Free tools that frequently time out on large websites include Xenu Link Sleuth and LinkChecker.
Common Causes of Broken Links
Out of Stock Products
In our experience, the number one cause of broken links is ecommerce websites “turning off” products when they go out of stock. Better ways to handle out-of-stock items are to offer back-order or email alerts, or to redirect to the next closest product with a courtesy message about the item being out of stock.
However, many webmasters simply 404 the page, completely forgetting that the URLs are often included in HTML and XML sitemaps, and sometimes even have links from external resources. In the SEO world, links are like pipes passing gold into our pockets. To 404 a page with links to it would be like telling the search engines “sorry, we don’t want your gold”. Seriously, why would you ever do that?
Website Upgrades
Nothing takes traffic away faster than a lazy webmaster unwilling to take the time to redirect old pages to their respective new URLs.
Sorry folks, but redirecting old URLs to the homepage could be compared to cutting the link pipeline we mentioned above. Do the right thing: put your old URLs into a spreadsheet, match them to the new URLs and create your permanent (301) redirects from old URL to new URL (get help). Your visitors and web crawlers will thank you later.
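If your spreadsheet exports to CSV, turning it into redirect rules can be scripted. The sketch below is one way to do it in Python, and it makes a few assumptions of ours: the site runs on Apache (so the output uses mod_alias `Redirect 301` lines for an .htaccess file), the CSV has two columns (old path, new URL), and none of the paths contain spaces. The function name and sample rows are hypothetical.

```python
import csv
import io


def build_redirects(mapping_csv: str) -> list[str]:
    """Turn 'old_path,new_url' CSV rows into Apache 'Redirect 301' lines."""
    lines = []
    seen = set()
    for row in csv.reader(io.StringIO(mapping_csv)):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        old_path, new_url = row[0].strip(), row[1].strip()
        if not old_path.startswith("/"):
            continue  # skip the header row and anything that is not a path
        if old_path in seen or old_path == new_url:
            continue  # keep the first mapping only; never redirect to self
        seen.add(old_path)
        lines.append(f"Redirect 301 {old_path} {new_url}")
    return lines


# Hypothetical export from the old-to-new URL spreadsheet.
SAMPLE = """old_path,new_url
/products/blue-widget.html,/shop/blue-widget/
/about-us.php,/about/
/about-us.php,/company/
"""

for line in build_redirects(SAMPLE):
    print(line)
```

The duplicate check matters more than it looks: when the same old URL appears twice in the spreadsheet, this sketch keeps the first mapping and drops the rest, so review any duplicates before you export. On a server other than Apache, keep the mapping logic and change the output format.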
Moving Pages Around
There are a million reasons you might want to move a page. For example, a blog post becomes a reference guide and you decide to convert it from a post to an evergreen content page, or perhaps you realize that you have two very similar pages and decide to combine them into a new page.
Whatever the reason, pages move and it’s perfectly normal. What’s not normal is moving the page without creating a 301 redirect. In fact, it’s borderline ignorant when you consider the gold analogy above.
How to Prevent and Fix Broken Links
Step One: Prevent Broken Links
In the web design world, we live off the cuff, and don’t take kindly to protocol or approval processes. Get over it. Create procedures for handling any scenario where a URL might change, so that the end result is the best possible outcome for a user clicking on the URL from a Google search.
Step Two: Identify Broken Links
Use any or all of the tools mentioned above to identify all the 404 errors on your website. You might find that most 404 errors have a pattern that can be fixed with one simple line of code in your .htaccess file.
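Spotting those patterns is easier if you group the broken URLs before reading through them. Here is a rough Python sketch that counts 404 URLs by their first path segment; the function name and the example.com URLs are made up for illustration, and your crawl error export will look different.

```python
from collections import Counter
from urllib.parse import urlsplit


def group_by_first_segment(urls: list[str]) -> list[tuple[str, int]]:
    """Count broken URLs by top-level directory, most common first."""
    counts = Counter()
    for url in urls:
        path = urlsplit(url).path
        segments = [part for part in path.split("/") if part]
        if len(segments) > 1:
            prefix = "/" + segments[0] + "/"  # e.g. /blog/
        else:
            prefix = path or "/"  # single page at the site root
        counts[prefix] += 1
    return counts.most_common()


# Hypothetical list of URLs reported as 404.
BROKEN = [
    "https://example.com/blog/2013/first-post",
    "https://example.com/blog/2013/second-post",
    "https://example.com/old-page.html",
    "https://example.com/blog/third-post",
]

for prefix, count in group_by_first_segment(BROKEN):
    print(f"{count:>4}  {prefix}")
```

A directory that accounts for most of the list is a good candidate for a single pattern-based rule (on Apache, for instance, a `RedirectMatch` line), while the one-offs go into the spreadsheet.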
List remaining 404 errors in a spreadsheet (we use Google Sheets to team up and tackle issues), sort alphabetically and assign to team members accordingly.
Step Three: Fix 404 Errors and Resubmit XML Sitemap
Fixing broken links is usually as simple as updating a link on a page to point to the best new page or, in some circumstances, removing the link altogether. In many cases, it may require consulting with your webmaster or programming team.
Once all 404 errors have been resolved, update your HTML and XML sitemaps (most content management systems do this automatically for you). When done, use the broken link checker of your choice to verify all issues have been resolved.
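If you would rather double-check the XML sitemap yourself before resubmitting, a short script can do it. The following Python sketch uses only the standard library and assumes a plain sitemap in the sitemaps.org format (not a sitemap index). The function names and sample sitemap are ours, and the status check is passed in as a parameter so it can be tried without touching the network.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from typing import Callable

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Extract every <loc> URL from a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text]


def http_status(url: str) -> int:
    """Return the HTTP status for a URL using a HEAD request.

    urlopen follows redirects, so a 301 that lands on a working page
    reports 200. Network failures (DNS, timeouts) are not handled here.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def broken_urls(sitemap_xml: str,
                status_of: Callable[[str], int] = http_status) -> list[str]:
    """List the sitemap URLs whose status is 404."""
    return [url for url in sitemap_urls(sitemap_xml) if status_of(url) == 404]


# Offline demonstration with a made-up sitemap and made-up statuses.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/discontinued-widget</loc></url>
</urlset>"""
FAKE_STATUS = {
    "https://example.com/": 200,
    "https://example.com/discontinued-widget": 404,
}

print(broken_urls(SAMPLE, FAKE_STATUS.get))
```

To run it against a live site, call `broken_urls(your_sitemap_text)` and let it use the default `http_status` check. Some servers answer HEAD requests differently than GET, so treat the output as a prompt for a manual look, not a final verdict.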
If everything looks good, navigate to both Bing and Google Webmaster Tools and resubmit your XML sitemap.
As a reminder, take the time to schedule weekly reviews of crawl errors, security issues, and other data provided in Webmaster Tools; a simple five-minute weekly routine won’t kill your schedule, I promise.