Tuesday, October 22, 2013

Character Definitions for htaccess

#
the # instructs the server to ignore the line; it is used for including comments. each line of comments requires its own #. when including comments, it is good practice to use only letters, numbers, dashes, and underscores, which helps avoid potential server parsing errors.

[F]
Forbidden: instructs the server to return a 403 Forbidden to the client.

[L]
Last rule: instructs the server to stop processing further rewrite rules once the current rule has been applied.

[N]
Next: instructs Apache to rerun the rewriting process from the first rule, using the result of the current rule as the new input.

[G]
Gone: instructs the server to deliver a 410 Gone (no longer exists) status message.

[P]
Proxy: instructs the server to handle the request via mod_proxy.

[C]
Chain: instructs the server to chain the current rule with the next rule; if the rule matches, the chained rule is processed as well, otherwise it is skipped.

[R]
Redirect: instructs Apache to issue a redirect, causing the browser to request the rewritten/modified URL.

[NC]
No Case: defines any associated argument as case-insensitive. i.e., "NC" = "No Case".

[PT]
Pass Through: instructs mod_rewrite to pass the rewritten URL back to Apache for further processing.

[OR]
Or: specifies a logical "or" that ties two expressions together such that either one proving true will cause the associated rule to be applied.

[NE]
No Escape: instructs the server to parse output without escaping characters.

[NS]
No Subrequest: instructs the server to skip the rule if the current request is an internal sub-request.

[QSA]
Append Query String: instructs the server to append the original query string to the rewritten URL instead of replacing it.

[S=x]
Skip: instructs the server to skip the next "x" number of rules if a match is detected.

[E=variable:value]
Environmental Variable: instructs the server to set the environmental variable "variable" to "value".

[T=MIME-type]
Mime Type: declares the mime type of the target resource.

[]
specifies a character class, in which any character within the brackets will be a match. e.g., [xyz] will match either an x, y, or z.

[]+
character class in which any combination of items within the brackets will be a match. e.g., [xyz]+ will match any number of x’s, y’s, z’s, or any combination of these characters.

[^]
specifies not within a character class. e.g., [^xyz] will match any character that is neither x, y, nor z.

[a-z]
a dash (-) between two characters within a character class ([]) denotes the range of characters between them. e.g., [a-zA-Z] matches all lowercase and uppercase letters from a to z.

a{n}
specifies an exact number, n, of the preceding character. e.g., x{3} matches exactly three x’s.

a{n,}
specifies n or more of the preceding character. e.g., x{3,} matches three or more x’s.

a{n,m}
specifies a range of numbers, between n and m, of the preceding character. e.g., x{3,7} matches three, four, five, six, or seven x’s.

()
used to group characters together, thereby considering them as a single unit. e.g., (perishable)?press will match press, with or without the perishable prefix.

^
denotes the beginning of a regex (regex = regular expression) test string; i.e., the string must begin with the character(s) that follow.

$
denotes the end of a regex (regex = regular expression) test string; i.e., the string must end with the character(s) that precede it.

?
declares as optional the preceding character. e.g., monzas? will match monza or monzas, while mon(za)? will match either mon or monza. i.e., x? matches zero or one of x.

!
declares negation. e.g., “!string” matches everything except “string”.

.
a dot (or period) indicates any single arbitrary character.

-
as a substitution, the dash instructs the server not to rewrite the URL (i.e., pass it through unchanged), as in “...domain.com.* - [F]”.

+
matches one or more of the preceding character. e.g., G+ matches one or more G’s, while “.+” will match one or more characters of any kind.

*
matches zero or more of the preceding character. e.g., use “.*” as a wildcard.

|
declares a logical “or” operator. for example, (x|y) matches x or y.

\
escapes special characters ( ^ $ ! . * | ). e.g., use “\.” to indicate/escape a literal dot.

\.
indicates a literal dot (escaped).

/*
zero or more slashes.

.*
zero or more arbitrary characters.

^$
defines an empty string.

^.*$
the standard wildcard: matches an entire test string of zero or more arbitrary characters.

[^/.]+
matches one or more characters, none of which is a slash or a dot.

http://
this is a literal statement — in this case, the literal character string, “http://”.

^domain.*
defines a string that begins with the term “domain”, which may then be followed by any number of arbitrary characters.

^domain\.com$
defines the exact string “domain.com”.

-d
tests if string is an existing directory

-f
tests if string is an existing file

-s
tests if the file named in the test string exists and has non-zero size (see the combined example below).
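To tie several of these definitions together, here is a minimal .htaccess sketch; the domain, directory, bot name, and index.php routing target are hypothetical placeholders rather than anything taken from this cheat sheet:

# block a hypothetical bad bot: [NC] for case-insensitivity, "-" for no rewrite, [F] for 403 Forbidden
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC]
RewriteRule .* - [F,L]

# mark a retired directory as 410 Gone with [G]
RewriteRule ^old-stuff/ - [G]

# if the request is not an existing file (-f) or directory (-d), route it to index.php,
# appending the original query string with [QSA]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?path=$1 [QSA,L]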

Redirection Header Codes
  • 301 – Moved Permanently
  • 302 – Moved Temporarily
  • 403 – Forbidden
  • 404 – Not Found
  • 410 – Gone
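A couple of these status codes can also be triggered directly from .htaccess. A minimal sketch, assuming hypothetical file paths:

# serve a custom page for 404 Not Found responses
ErrorDocument 404 /errors/not-found.html

# answer requests for a retired page with 410 Gone
Redirect gone /old-campaign.html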

Tags : Character Definitions for htaccess

Monday, October 21, 2013

How to do a 301 Redirect Properly on Apache or Windows Servers

In today’s blog post we will discuss how to implement a proper 301 redirect on Apache or Windows servers, from one domain to another. Redirects are technical, and we see a lot of sites where 301 redirects are not implemented properly. You might want to do a 301 redirect for a number of reasons: redirecting the non-www version of your site to www (or vice versa), changing your domain, or moving a file within the same domain. 301 redirects are also a great way to fix your 404 error pages.

Before we get into the technical details, it is important to understand why a 301 redirect from the non-www to the www version of your site (or vice versa) matters. First, having two versions of your site can create duplicate content, which may result in your website being penalized by search engines. Secondly, and most importantly, when you acquire links it is always much better to have them all pointing at one version of the site; distributing them between two versions dilutes the authority those links pass to your domain.

A 301 redirect is the preferred way of handling duplicate content. Other options include the rel="canonical" tag (don’t use it cross-domain; Yahoo/Bing still don’t recognize it there), blocking files in robots.txt, and the meta noindex tag.

Let’s dive into the technical details:

How to do a 301 redirect for an Apache server:

Step 1 : To implement a 301 redirect, the file we need to work with is the .htaccess file. To access it, connect to your server via FTP and look in the document root.

Step 2 : If you can’t see it, enable viewing of hidden files, since the .htaccess file is hidden by default. If there is still no .htaccess file present, create one with a plain text editor.

Step 3 : Insert this code in the file:

Code example from non www to www:

RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule .? http://www.example.com%{REQUEST_URI} [R=301,L]

Obviously you will need to replace ‘example’ with your own domain name.

Also make sure the rewrite engine is turned on; the RewriteEngine On line only needs to appear once in the file.

Step 4 : Save and Test it!
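The rewrite rule above handles www canonicalization. If you are instead moving a single file, or an entire site to a new domain, a simpler sketch using Apache’s Redirect directive (mod_alias) also returns a 301; the file names and domains below are hypothetical placeholders:

# permanently redirect one moved page
Redirect 301 /old-page.html http://www.example.com/new-page.html

# send everything on this host to a new domain, preserving the request path
Redirect 301 / http://www.new-example.com/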

How to do a 301 redirect for a Windows server:

When setting up a site in IIS, the normal process is to create one account for the site and add both the www and non-www versions of the domain name to the host headers for the account. This creates a canonicalization issue: the site will then be available at both www and non-www URLs.

Step 1 : Get access to the Windows Server navigation Panel. Navigate your way to the Internet Services Manager (Programs — Administrative Tools — Internet Services Manager).

Step 2 : Create 2 accounts for the site within IIS: one with the www version of the domain in the host header and one with the non-www version of the domain. All of the site files can be placed in the preferred version and a single page in the other.

Step 3 : Right click on the single page you want to redirect FROM and choose Properties. The Properties box will now appear.

Step 4 : Change the redirect option to “A redirection to a URL” and type in the new URL in the box provided.

Step 5 : Be sure to check the box marked “A permanent redirection for this resource”. If you leave this box unchecked, you will create a 302 (temporary) redirect, which is not permanent or beneficial from an SEO standpoint in this situation.

Step 6 : Test it!

Doing a www redirect for Front Page

As I see some of the comments below pertaining to Front Page, it was only a matter of time before I had to do one for this God-forsaken MS product myself.  Here’s how I did it, after some trial and error:

1.  First, you have to identify whether you are running Linux or Windows.  This works for Linux.  Apparently, there is an Apache option called FollowSymlinks which needs to be turned on, as well as mod_rewrite, so call your hosting provider for that one.

2.  FrontPage uses several .htaccess files – one in the main directory structure, and 3 other .htaccess files called “super files”.  You will find these other .htaccess files here:

/_vti_bin/.htaccess
/_vti_bin/_vti_aut/.htaccess
/_vti_bin/_vti_adm/.htaccess

3.  Make sure this is at the top of all 4 .htaccess files: “Options +FollowSymlinks” underneath “# -FrontPage-”
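That is, the top of each of the four files should look something like this (a minimal sketch; keep whatever FrontPage lines already follow):

# -FrontPage-
Options +FollowSymlinks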

4.  Underneath this, add your 301 redirect command:

RewriteEngine On 
RewriteCond %{HTTP_HOST} ^yoursite\.com$ [NC]
RewriteRule ^(.*)$ http://www.yoursite.com/$1 [R=301,L]

Here, I did a 301 from non-www to the www, because for SEO purposes, most people have more inbound links pointing to the www version.

That’s it – this should work!


Tags : How to do a 301 Redirect Properly on Apache, How to do a 301 Redirect Properly on Windows Servers, non www to www, redirecting on www, non www to www with .htaccess

Redirecting non-www to www with .htaccess

If you want to redirect all non-www requests to your site to the www version, all you need to do is add the following code to your .htaccess file:
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

This will redirect any requests to http://my-domain.com to http://www.my-domain.com. There are several benefits from doing that:
  • It will avoid duplicate content in Google
  • It will avoid the possibility of split page rank and/or split link popularity (inbound links).
  • It's nicer, and more consistent.
Note that if your site has already been indexed by Google without the www, this might cause unwanted side effects, like a loss of PageRank. I don't think this would happen, and in any case it would be a temporary issue (we are doing a permanent redirect, a 301, so Google should transfer all rankings to the www version). But anyway, use at your own risk!

Something nice about the code above is that you can use it for any website, since it doesn't include the actual domain name.

Redirecting www to non-www

If you want to do the opposite, the code is very similar:
RewriteEngine On
RewriteCond %{HTTP_HOST} !^my-domain\.com$ [NC]
RewriteRule ^(.*)$ http://my-domain.com/$1 [R=301,L]
In this case we are explicitly typing the domain name. I'm sure it's possible to do it in a generic way, but I haven't had the time to work one out and test it (a possible sketch follows). So remember to change 'my-domain' to your own domain name!
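For what it's worth, a generic version along the following lines is often suggested; treat it as an untested sketch that simply strips a leading "www." from whatever hostname was requested:

RewriteEngine On
# %1 refers to the hostname (minus the leading www.) captured in the RewriteCond
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]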



Tags : non www to www, redirecting on www, non www to www with .htaccess

How Google is Changing Long-Tail Search with Efforts Like Hummingbird

The Hummingbird update was different from the major algorithm updates like Penguin and Panda, revising core aspects of how Google understands what it finds on the pages it crawls. In today's Whiteboard Friday, Rand explains what effect that has on long-tail searches, and how those continue to evolve.




Video Transcription

Howdy, Moz fans and welcome to another edition of Whiteboard Friday. This week I wanted to talk a little bit about Google Hummingbird slightly, but more broadly how Google has been making many efforts over the years to change how they deal with long-tail search.

Now long tail, if you're not familiar already, is those queries that are usually lengthier in terms of number of words in the phrase and refer to more specific kinds of queries than the sort of head of the demand curve, which would be shorter queries, many more people performing them, and, generally speaking, the ones that in our profession, especially in the SEO world, the ones that we tend to care about. So those are the shorter phrases, the head of the demand curve, or the chunky middle of the demand curve versus the long tail.

Long tail, as Google has often mentioned, is a very big proportion of Web search traffic. Anywhere from 20% to maybe 40% or even 50% of all the queries on the Web are in that long tail, in the bucket of fewer than maybe 10 to 50 searches per month. Somewhere around 18% or 20% of all searches, Google says, are extremely long tail, meaning they've never seen them before: extremely unique kinds of searches.

I think Google struggles with this a little bit. They struggle from an advertising perspective because they'd like to be able to serve up great ads targeting those long-tail phrases, but inside of AdWords, Google's Keyword Tool, for self-service advertising, it's tough to choose those. Google doesn't often show volume around them. Google themselves might have a tough time figuring out, "hey, is this query relevant to these types of results," especially if it's in a long tail.

So we've seen them get more and more sophisticated with content, context, and textual analysis over the years, and that brings us to Hummingbird, released in August according to Google, which was an infrastructure update more so than an algorithmic update. You can think of Penguin or Panda as algorithm-style updates, while Google Caffeine upgraded their speed, and Hummingbird, they say, upgrades their text processing and their content and context understanding mechanisms, and that is what is affecting things today.

I'll try and illustrate this with an example. Let's say Google gets two search queries, "best restaurants SEA," Seattle's airport, that's the airport code, the three-letter code, and "where to eat at Sea-Tac Airport in Terminal C." Let's say then that we've got a page here that's been produced by someone who has listed the best restaurants at Sea-Tac, and they've ordered them by terminals.

So if you're in Terminal A, Terminal B, Terminal C, it's actually easy to walk between most of them except for N and S. I hope you never have to go N. It's just a pain. S is even more of a pain. But in Terminal C, which I assume would be Beecher's Cheese, because that place is incredible. It just opened. It's super good. In Terminal C, they've got a Beecher's Cheese, so they've got a listing for this.

A smart Google, an intelligent engineer at Google would go, "Man, you know, I'd really like to be able to serve up this page for this result. But it doesn't target the words 'where to eat' or 'Terminal C' specifically, especially not in the title or the headline, the page title. How am I going to figure that out?" Well, with upgrades like what we've seen with Hummingbird, Google may be able to do more of this. So they essentially say, "I want to understand that this page can satisfy both of these kinds of results."

This has some implications for the SEO world. On top of this, we're also getting kind of biased away from long-tail search, because keyword (not provided) means it's harder for an individual marketer to say: "Oh, are people searching for this? Are people searching for that? Is this bringing me traffic? Maybe I can optimize my page more towards it, optimize my content for it."

So this kind of combination and this direction that we're feeling from Google has a few impacts. Those include more traffic opportunities for great content that isn't necessarily doing a fantastic job at specific keyword targeting.

So this is kind of interesting from an SEO perspective, because we're not saying, and I'm definitely not saying, stop doing keyword targeting, stop putting good keywords in your titles and making your pages contextually relevant to search queries. But I am saying if you do a good job of targeting this, best restaurants at SEA or best restaurants Sea-Tac, you might find yourself getting a lot more traffic for things like this. So there's almost an increased benefit to producing that great content around this and serving, satisfying a number of needs that a search query's intent might have.

Unfortunately, for some of us in the SEO world, it could get rougher for sites that are targeting a lot of mid and long-tail queries through keyword targeting that aren't necessarily doing a fantastic job from a content perspective or from other algorithmic inputs. So if it's the case that I just have to be ranking for a lot of long-tail phrases like this, but I don't have a lot of the brand signals, link signals, social signals, user usage signals, I just have strong keyword signals, well, Google might be trying to say, "Hey, strong keyword signals doesn't mean as much to us anymore because now we can take pages that we previously couldn't connect to that query and connect them up."

In general, what we're talking about is Google rewarding better content over more content, and that's kind of the way that things are trending in the SEO world today.

So I'm sure there's going to be some great discussion. I really appreciate the input of people who have done extensive analysis on top of Hummingbird. Those include Dr. Pete, of course, from Moz; Bill Slawski from SEO by the Sea; and Ammon Johns, who wrote a great post about this. I think there'll be more great discussion in the comments. I look forward to joining you there. Take care.



Thursday, October 17, 2013

How Google’s Disavow Links Tool Can Remove Penalties

Can using Google’s link disavow tool help remove penalties? Yes, the company says. But when it comes to manual penalties, disavowing links alone isn’t enough. With algorithmic penalties, there may be a time delay involved. Below, more about how both methods work.

Over the past few days, I’ve encountered a couple of cases where people are confused about how the link disavow tool works to remove penalties. So, I figured a clarification post was in order. Here’s the situation, all of which I reverified with Google yesterday.

Disavowing Links: “Don’t Count These Votes!”

If you submit a disavow request, Google will automatically process that request and tag those links pointing at your site in the same manner as if they had the nofollow tag on them, in other words, as if they aren’t actually pointing at your site for link counting and analysis purposes.


This is something that came up again in a Google Webmaster Central hangout video yesterday.


In short, if links are votes, using the link disavow tool effectively tells Google that you don’t want any of those votes counted, for better or worse, toward your rankings.

This all happens automatically, and Google says it still takes several weeks until the disavow request is processed.
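For reference, the disavow file you upload is just a plain text list; the domains and URLs below are hypothetical placeholders, but the format (one entry per line, # for comments, domain: to disavow an entire site) follows Google's documented syntax:

# disavow every link from this hypothetical domain
domain:spammy-directory-example.com
# disavow a single hypothetical page
http://another-example.com/paid-links-page.html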

Removing Algorithmic Penalties

Now let’s take a situation where you’re hit by an algorithmic penalty related to links, such as the Penguin Update. “Algorithmic” means an automatic penalty, one that involves no human review at all. Rather, Google’s computers have ruled that your site has done something wrong.

To remove that penalty, you need to clean up your links. That’s where link disavow can help. Let’s assume you use it to correctly disavow bad links that were hurting you.

That’s step one, cleaning up the links. Step two is waiting for the disavow request to get processed. That, as I’ve said, may take several weeks.

Step three is that you have to wait until the next time Google runs your site against whatever part of its algorithm hit you. For many, that means Penguin. Even if you’ve cleaned up your links with disavow, you have to wait until the Penguin Update is run again before you’ll see an impact.

For example, let’s assume you were hit by Penguin 3 last October. You used the link disavow tool to clean up your links soon after that. You still have to wait until Penguin 4 happens before you should see a change (and Google has said that the next Penguin update hasn’t happened yet).

Now take the same situation, where you file the disavow request just a few days before a Penguin Update. Even though the request went ahead of the update, you still might not get cleared because by the time it’s processed (several weeks), the latest update will have happened. You’ll have to wait for the one after that.

Eventually, if you’ve used the tool, you should see a change. It’ll just take time. But if it was an algorithmic penalty, then it should automatically clear if you file disavow alone (or clean up your link profile in other ways).

Removing Manual Penalties

The situation is different — and potentially much faster — if you were hit by a manual penalty. That’s when some human being at Google has reviewed your site and decided that it deserves a penalty. In virtually all of these cases, it also means you would have received a notice from Google that this has happened.

If the penalty involves bad links, the link disavow tool can help you disown those. However, the penalty won’t automatically be removed because it was placed manually. You have to also file a reconsideration request. This will prompt a human being at Google to check on your site. They can see that the link disavow request has been filed, and if that’s enough, then the manual penalty may get lifted.

You have to do both: disavow links and file a reconsideration request, which Google has said before. And really, you have to do a third thing, which is make a good faith effort to remove links beyond just using link disavow, which Google has also said before (see our Q&A With Google’s Matt Cutts On How To Use The Link Disavow Tool for more about this).

There is one caveat to the above. Manual penalties have expiration dates, Google reminds us. This means after a period of time, perhaps a few weeks or a few months, the penalty against your site should expire naturally. That’s why you might see an improvement even if you do nothing. (But note from the comments below, some penalties can go on for two or three years before they expire.)

Doing nothing, however, may leave you subject to an algorithmic penalty in the future. In short, if you get a manual penalty, take that as a solid warning you need to fix something, lest you face a longer-term algorithmic penalty in the future.

Google’s Matt Cutts: Guest Blogging Best Done In Moderation

In his latest video, Google’s head of search spam Matt Cutts answers the question, “How can I guest blog without it appearing as if I paid for links?”

According to Cutts, when his team reviews spam reports, there is usually a clear distinction between organic guest blog content and someone who is paying for links.

Cutts identified specific differences between spam and organic guest blogging content, noting that spam content typically doesn’t match the subject of the blog itself and contains keyword-rich anchor text.

A true guest blogger, said Cutts, is usually someone who is an expert on the subject matter and doesn’t drop a heavy amount of keywords in their anchor text.

“Guest blogging seems like it’s the fad of the month,” said Cutts, but he says it is best done in moderation. “It shouldn’t be your full-time job.”

He emphasizes that if guest blogging is all you are doing to drive traffic to your site, you’re probably not doing any favors for your site’s reputation.