Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have been reading in many SQL books and articles that selectivity is an important factor in creating index. If a column has low selectivity, an index seek does more harm that good. But none of the articles explain why. Can anybody explain why it is so, or provide a link to a relevant article?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
245 views
Welcome To Ask or Share your Answers For Others

1 Answer

From SimpleTalk article by Robert Sheldon: 14 SQL Server Indexing Questions You Were Too Shy To Ask

The ratio of unique values within a key column is referred to as index selectivity. The more unique the values, the higher the selectivity, which means that a unique index has the highest possible selectivity. The query engine loves highly selective key columns, especially if those columns are referenced in the WHERE clause of your frequently run queries. The higher the selectivity, the faster the query engine can reduce the size of the result set. The flipside, of course, is that a column with relatively few unique values is seldom a good candidate to be indexed.

Also check these articles:

From the SqlServerCentral article:

In general, a nonclustered index should be selective. That is, the values in the column should be fairly unique and queries that filter on it should return small portions of the table.

The reason for this is that key/RID lookups are expensive operations and if a nonclustered index is to be used to evaluate a query it needs to be covering or sufficiently selective that the costs of the lookups aren’t deemed to be too high.

If SQL considers the index (or the subset of the index keys that the query would be seeking on) insufficiently selective then it is very likely that the index will be ignored and the query executed as a clustered index (table) scan.

It is important to note that this does not just apply to the leading column. There are scenarios where a very unselective column can be used as the leading column, with the other columns in the index making it selective enough to be used.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...