Sanal.org
till 01th May 2006
Categories
.NET (40)
Art (1)
Blogs (1)
Computers (6)
Information (13)
Internet (11)
Movies (5)
Music (19)
MY SQL (5)
People (16)
Russian (20)
Security (15)
SEO (3)
SQL SERVER (8)
Technology (6)
Turkey (2)
Video (7)
More
TOP 50
Turkish News


Archive
April (2008)
March (2008)
February (2008)
January (2008)
December (2007)
Novamber (2007)
October (2007)
September (2007)
August (2007)
July (2007)
June (2007)
May (2007)
April (2007)
March (2007)
February (2007)
January (2007)
December (2006)
Novamber (2006)
October (2006)
September (2006)
August (2006)
July (2006)
June (2006)
May (2006)
April (2006)
March (2006)
February (2006)
January (2006)
December (2005)
October (2005)
April (2005)
December (1969)

124

ROR

High Performance Full Text Search for Database Content (MYSQL)


MySQL Full Text Search

? Available only for MyISAM tables
? Natural Language Search and boolean search
? Query Expansion support
? ft_min_word_len – minimum 3 char per word by default
? Extensive stop word list by default
– Containing «zero», «talk» etc
? Frequency based ranking
– Distance between words is not counted

MySQL Full Text Search Example

mysql> CREATE TABLE articles (
    ->   id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
    ->   title VARCHAR(200),
    ->   body TEXT,
    ->   FULLTEXT (title,body)
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO articles (title,body) VALUES
    -> ('MySQL Tutorial','DBMS stands for DataBase ...'),
    -> ('How To Use MySQL Well','After you went through a ...'),
    -> ('Optimizing MySQL','In this tutorial we will show ...'),
    -> ('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
    -> ('MySQL vs. YourSQL','In the following database comparison ...'),
    -> ('MySQL Security','When configured properly, MySQL ...');
Query OK, 6 rows affected (0.00 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM articles
    -> WHERE MATCH (title,body)
    -> AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-------------------+------------------------------------------+
| id | title             | body                                     |
+----+-------------------+------------------------------------------+
|  5 | MySQL vs. YourSQL | In the following database comparison ... |
|  1 | MySQL Tutorial    | DBMS stands for DataBase ...             |
+----+-------------------+------------------------------------------+
2 rows in set (0.00 sec)

 

MySQL Full Text Search Performance

? Index (and better data) should fit in memory
– And key_buffer_size sized appropriately
? Stop words list should contain most frequent
keywords
– But make sure you do not need to search by them
? Lowering ft_min_word_len decreases
performance
– Low length keywords are often frequent
? OPTIMIZE TABLE
– updates/deletes fragment  index


Avoid Count and Limit

? Avoid counting number of matches
– select title from cont where match (title,content) against ("democratic president candidate") limit 10;
? Takes 0.25 sec
– select count(*)  from cont where match (title,content) against  ("democratic president candidate")
? Takes 16 min 13.41 sec
? Avoid Sorting
– By separate field as «date» or by self computed relevance in boolean search
– select title,match(title) against ("post office" in boolean mode)*10 + match(content) against("post office" in boolean mode) relevance  from
cont where match (title,content) against ("post office" in boolean mode)
order by relevance desc  limit 10;
? Takes 3 min 38.35 sec

Be careful with WHERE and LIMIT
? Be care careful with extra where clause if it filters out a lot of matches.
– select title  from cont where match (title,content) against ("global service
company") and date_added>unix_timestamp("2006-07-18 18:00:00")
limit 10;
? Takes  26 min 43.59 sec
? Large LIMIT offset slow down dramatically
– select title  from cont where match (title,content) against ("computer game industry") limit 1000,10;
? Takes  9.99 sec

GROUP BY and Phrase Search
? GROUP BY requires all results so it is slow as well
– select count(*) cnt,date(from_unixtime(date_added))   from cont where
match (title,content) against ("computer game industry") group by date(from_unixtime(date_added));
? Takes 26 min 38.76 sec
– For example if we want to graph number of matches over time.
? Phrase Search makes Boolean Search pretty
slow
– select title   from cont where match (title,content) against ('"bill gates will"' in boolean mode) limit 10;
? Takes 18.08 sec
– MySQL has to look at data rows which match words to check if there is
phrase match so it is especially bad for frequent words in rare
combination




0.015625
Bilgi Portalı | Forum | Haberler | Havacılık | Daily News | Noticias