Internal links are hyperlinks whose target is on the same domain as the page the link appears on (the source). In layman’s terms, an internal link is one that points to another page on the same website.
<a href="http://www.same-domain.com/" title="Keyword Text">Keyword Text</a>
Use descriptive keywords in anchor text that give a sense of the topic or keywords the source page is trying to target.
What is an Internal Link?
Internal links are links that go from one page on a domain to a different page on the same domain. They are commonly used in main navigation.
These types of links are useful for three reasons:
- They allow users to navigate a website.
- They help establish information hierarchy for the given website.
- They help spread link equity (ranking power) around websites.
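The definition above (source and target on the same domain) can be sketched in a few lines of Python. This is only an illustration: the function name `is_internal_link` is hypothetical, and it assumes that "same domain" means an identical hostname, so a subdomain such as a blog host would count as external here.

```python
from urllib.parse import urljoin, urlparse


def is_internal_link(source_url: str, href: str) -> bool:
    """Return True if href, as found on source_url, points to the same host."""
    # Resolve relative hrefs such as "/about" against the page they appear on.
    target_url = urljoin(source_url, href)
    source_host = urlparse(source_url).hostname
    target_host = urlparse(target_url).hostname
    # Assumption: "same domain" is treated as "identical hostname".
    return source_host is not None and source_host == target_host


print(is_internal_link("http://www.same-domain.com/page", "/about"))
print(is_internal_link("http://www.same-domain.com/page", "http://www.other-domain.com/"))
```

The first call resolves the relative link against the source page and reports it as internal; the second points at a different host and is reported as external.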
SEO Best Practice
Internal links are most useful for establishing site architecture and spreading link equity (URLs are also essential). For this reason, this section is about building an SEO-friendly site architecture with internal links.
On an individual page, search engines need to see content in order to list pages in their massive keyword-based indices. They also need access to a crawlable link structure, one that lets spiders browse the pathways of a website, in order to find all of the pages on a website. (To get a peek into what your site’s link structure looks like, try running your site through Open Site Explorer.) Hundreds of thousands of sites make the critical mistake of hiding or burying their main link navigation in ways that search engines cannot access. This hinders their ability to get pages listed in the search engines’ indices. Below is an illustration of how this problem can happen:
In the example above, Google’s colorful spider has reached page “A” and sees internal links to pages “B” and “E.” However important pages C and D might be to the site, the spider has no way to reach them, or even know they exist, because no direct, crawlable links point to those pages. As far as Google is concerned, these pages basically don’t exist. Great content, good keyword targeting, and smart marketing don’t make any difference at all if the spiders can’t reach those pages in the first place.
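The A-to-E scenario can be modeled as a small graph traversal. The sketch below is a simplification, not how any search engine actually crawls: the `LINKS` dictionary is a hypothetical reconstruction of the example (A links to B and E, nothing links to C or D), and it assumes the other pages have no outgoing links.

```python
from collections import deque

# Hypothetical link graph for the example: page -> pages it links to.
LINKS = {
    "A": ["B", "E"],
    "B": [],
    "C": [],
    "D": [],
    "E": [],
}


def reachable_pages(links, start):
    """Breadth-first walk over crawlable links, starting from one page."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


crawled = reachable_pages(LINKS, "A")
orphaned = sorted(set(LINKS) - crawled)
print(sorted(crawled))  # ['A', 'B', 'E']
print(orphaned)         # ['C', 'D']
```

Pages that never appear in the crawled set are the ones a spider following links from A cannot discover.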
The optimal structure for a website would look similar to a pyramid (where the big dot on the top is the homepage):
Below are some common reasons why pages might not be reachable, and thus, may not be indexed.
Links in Submission-Required Forms
Forms can include elements as basic as a drop-down menu or elements as complex as a full-blown survey. In either case, search spiders will not attempt to “submit” forms and thus, any content or links that would be accessible via a form are invisible to the engines.
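To make this concrete, here is a rough sketch of a link extractor that, like the spiders described above, only follows plain `<a href>` links and never submits forms. The class name `AnchorCollector`, the sample markup, and the URLs in it are all made up for illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects href values from <a> tags only, as a simple crawler might."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def crawlable_links(html):
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.hrefs


# Hypothetical page: category URLs live in a drop-down, one page is linked normally.
PAGE = """
<form action="/go" method="post">
  <select name="category">
    <option value="/category/shoes">Shoes</option>
    <option value="/category/hats">Hats</option>
  </select>
  <input type="submit" value="Go">
</form>
<a href="/about">About us</a>
"""

print(crawlable_links(PAGE))  # ['/about']
```

The two category pages are only reachable by submitting the form, so this extractor never sees them.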
Links Only Accessible Through Internal Search Boxes
Spiders will not attempt to perform searches to find content, and thus, it’s estimated that millions of pages are hidden behind completely inaccessible internal search box walls.
Links in Flash, Java, or Other Plug-Ins
Any links embedded inside Flash, Java applets, and other plug-ins are usually inaccessible to search engines.
Links Pointing to Pages Blocked by the Meta Robots Tag or Robots.txt
The Meta Robots tag and the robots.txt file both allow a site owner to restrict spider access to a page.
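As a rough illustration of the robots.txt half of this, Python's standard `urllib.robotparser` module can evaluate a set of rules against a URL. The rules, the `ExampleBot` user agent, and the URLs below are hypothetical, and the sketch does not cover the Meta Robots tag.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks one directory for all spiders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="ExampleBot"):
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("http://www.same-domain.com/products.html"))
print(is_crawlable("http://www.same-domain.com/private/report.html"))
```

An internal link pointing at a URL under the blocked directory would still exist in the HTML, but a spider that honors these rules would not fetch the page behind it.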
Links on Pages with Hundreds or Thousands of Links
The search engines all have a rough crawl limit of 150 links per page before they may stop spidering additional pages linked to from the original page. This limit is somewhat flexible, and particularly important pages may have upwards of 200 or even 250 links followed, but in general practice, it’s wise to limit the number of links on any given page to 150 or risk losing the ability to have additional pages crawled.
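A quick way to check a page against this guideline is to count its `<a href>` links. The sketch below uses Python's standard `html.parser`; the names `LinkCounter` and `exceeds_link_limit` are illustrative, the 150 figure is simply the rough limit quoted above, and the sample page is generated for the example.

```python
from html.parser import HTMLParser

LINK_LIMIT = 150  # rough per-page guideline quoted in the text, not a hard rule


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry a non-empty href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.count += 1


def count_links(html):
    counter = LinkCounter()
    counter.feed(html)
    counter.close()
    return counter.count


def exceeds_link_limit(html, limit=LINK_LIMIT):
    return count_links(html) > limit


# Hypothetical page with 200 links.
page = "".join(f'<a href="/page-{i}">Page {i}</a>' for i in range(200))
print(count_links(page))         # 200
print(exceeds_link_limit(page))  # True
```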
Links in Frames or I-Frames
Technically, links in both frames and I-Frames are crawlable, but both present structural issues for the engines in terms of organization and following. Only advanced users with a good technical understanding of how search engines index and follow links in frames should use these elements in combination with internal linking.
By avoiding these pitfalls, a webmaster can have clean, spiderable HTML links that will allow the spiders easy access to their content pages. Links can have additional attributes applied to them, but the engines ignore nearly all of these, with the important exception of the rel="nofollow" attribute.
Want to get a quick glimpse into your site’s indexation? Use a tool like Open Site Explorer or Screaming Frog to run a site crawl. Then, compare the number of pages the crawl turned up to the number of pages listed when you run a site: search on Google.
The rel="nofollow" attribute can be used with the following syntax:
<a href="/" rel="nofollow">nofollow this link</a>
In this example, by adding the rel="nofollow" attribute to the link tag, the webmaster is telling the search engines that they do not want this link to be interpreted as a normal, juice-passing, “editorial vote.” Nofollow came about as a method to help stop automated blog comment, guestbook, and link injection spam, but has morphed over time into a way of telling the engines to discount any link value that would ordinarily be passed. Links tagged with nofollow are interpreted slightly differently by each of the engines.
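To see which links on a page carry the attribute, the markup can be scanned for `rel` values containing `nofollow`. This is a minimal sketch using Python's standard `html.parser`; the `NofollowAudit` class and `audit_links` function are hypothetical names, and the sample markup reuses the two link examples shown in this article.

```python
from html.parser import HTMLParser


class NofollowAudit(HTMLParser):
    """Records each link as (href, is_nofollow)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def audit_links(html):
    audit = NofollowAudit()
    audit.feed(html)
    audit.close()
    return audit.links


SAMPLE = (
    '<a href="http://www.same-domain.com/" title="Keyword Text">Keyword Text</a>'
    '<a href="/" rel="nofollow">nofollow this link</a>'
)

for href, is_nofollow in audit_links(SAMPLE):
    print(href, "nofollow" if is_nofollow else "followed")
```

The first link has no `rel` attribute and is reported as followed; the second is flagged as nofollow.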