Hope i am not repeating this question. I did some search here and google before posting here.
I am running a eStore with SQL Server 2008R2 with Full Text enabled.
My requirements,
- There is a Product Table, which has product name, OEM Codes, Model which this product fits into. All are in text.
- I have created a new column called TextSearch. This has concatenated values of Product Name, OEM Code and Model which this product fits in. These values are comma separated.
- When a customer enters a keyword, we run search on TextSearch column to match for products. See matching logic below.
I am using a Hybrid Fulltext and normal like to do search. This gives more relevant results. All the queries executed in to a temp table and distincts were returned.
Matching logic,
Run following SQL to get relevant product using full text. But @Keywords will be pre-processed. Say 'CLC 2200' will be changed to 'CLC* AND 2200*'
SELECT Id FROM dbo.Product WHERE CONTAINS (TextSearch ,@Keywords)
Another query will be running using normal like. So 'CLC 2200' will be pre-processed to 'TextSearch like %clc% AND TextSearch like %2200%'. This is simply because full text search wont search patterns before the keywords. example, it wont return 'pclc 2200'.
SELECT Id FROM dbo.Product WHERE TextSearch like '%clc%' AND TextSearch like '%2200%'
If step 1 and 2 didn't return any records, following search will be executed. Value 135 was fine tuned by me to return more relevant records.
SELECT p.id FROM dbo.Product AS p INNER JOIN FREETEXTTABLE(product,TextSearch,@Keywords) AS r ON p.Id = r.[KEY] WHERE r.RANK > 135
All of above combined works fine in a reasonable speed and returns relevant products for keywords.
But i am looking for to further improve when there is no product found at all.
Say if customer looks for 'CLC 2200npk' and this product wasn't there, i needed to show next very close by 'CLC 2200'.
So far i tried using Soundex() function. Buy computing soundex value for each word in TextSearch column and comparing with soudex value of keyword. But this returns way too many records and slow too.
example, 'CLC 2200npk' will return products such as 'CLC 1100' etc. But this wouldn't be a good result. As it is not close to CLC 2200npk
There is another good one here. but this uses CLR Functions. But i can not install CLR functions on the server.
So my logic would need to be,
if 'CLC 2200npk' not found, show close by 'CLC 2200' if 'CLC 2200' not found, show next close by 'CLC 1100'
Questions
- Is it possible to match like as suggested?
- If i would need to do spelling correction and search, what would be good way? All of our product listing is in English.
- Is there any UDF or SP's to match texts like my suggestions?
Thanks.
See Question&Answers more detail:os