Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have an app that needs to update a large amount of data over a large number of entries. Basically it does some 7,000 inserts and/or updates but it takes a looooong time (like almost 9 minutes... averaging about 0.08 seconds per query). Essentially I'm looking for general speed ups to making multiple such requests (I'm not expecting a specific answer to my vague example... that's just to, hopefully, help explain).

Here are some samples from profiling the requests:

SELECT `habitable_planets`.* FROM `habitable_planets` WHERE (timestamp = '2010-10-15T07:30:00-07:00') AND (planet_id = '2010_Gl_581_c')

INSERT INTO `habitable_planets` (`planet_id`, `timestamp`, `weather_air_temp`, `weather_cell_temp`, `weather_irradiance`, `weather_wind_float`, `biolumin_to_date`, `biolumin_detected`, `craft_energy_usage`, `craft_energy_consumed_to_date`) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

SELECT `habitable_planets`.* FROM `habitable_planets` WHERE (timestamp = '2010-10-15T07:45:00-07:00') AND (planet_id = '2010_Gl_581_c')

INSERT INTO `habitable_planets` (`planet_id`, `timestamp`, `weather_air_temp`, `weather_cell_temp`, `weather_irradiance`, `weather_wind_float`, `biolumin_to_date`, `biolumin_detected`, `craft_energy_usage`, `craft_energy_consumed_to_date`) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

Repeat ad nauseam (well, about 7,000 times). This is an update that collects data generated at intervals over a 24 hour period and then makes a massive update to the database once per day. Given the limited bit I've shown you are there any suggestions for speeding this process up?

For example... would it make sense to, rather than do a select for each timestamp, do one select for a range all at once and then iterate over them in the script?

Vaguely like:

SELECT `habitable_planets`.* FROM `habitable_planets` WHERE (planet_id = '2010_Gl_581_c')

assign that result to $foo and then do:

foreach ($foo as $bar)
{
    if ($bar['timestamp'] == $baz) // where $baz is the needed timestamp
    {
    // do the insert here
    }
}

EDIT: To add a bit to this, one thing that improved responsiveness in my situation was to change a bunch of code that checked for a record existing and either did an insert or an update depending on the result into using an INSERT... ON DUPLICATE KEY UPDATE sql query. This resulted in about a 30% speed gain in my particular case because it cut at least one trip to the database out of the equation and over thousands of requests this really adds up.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
221 views
Welcome To Ask or Share your Answers For Others

1 Answer

Some useful links:

From MySQL Documentation:

Speed of INSERT Statements says:

  • If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements. If you are adding data to a nonempty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster.

  • If multiple clients are inserting a lot of rows, you can get higher speed by using the INSERT DELAYED statement.

  • For a MyISAM table, you can use concurrent inserts to add rows at the same time that SELECT statements are running, if there are no deleted rows in middle of the data file.

  • When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using INSERT statements.

  • With some extra work, it is possible to make LOAD DATA INFILE run even faster for a MyISAM table when the table has many indexes.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...