
Batch-Inserting 100,000 Rows: What Actually Makes It Fast?

13 2024-11-05

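We value original articles highly. To respect intellectual property and avoid potential copyright issues, we provide only a summary of the article here as a first overview. For the full content, visit the author's official account page.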

View original: Batch-Inserting 100,000 Rows: What Actually Makes It Fast?
Source: 江南一点雨
Article Summary


The article compares two approaches to batch data insertion over JDBC: inserting rows one by one in a loop with batch processing enabled, and generating a single SQL INSERT statement that carries multiple value tuples. It weighs SQL execution efficiency against network I/O to determine which approach is faster.

1. Thought Analysis

The first approach, inserting in a loop, benefits from the pre-compilation feature of JDBC's PreparedStatement, which makes subsequent executions of the same SQL faster. However, it can suffer from network I/O overhead when the database server and the application server are not on the same machine.
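To make the first approach concrete, here is a minimal plain-JDBC sketch. It is not the article's code (the article tests with MyBatis); the table `t_user`, its columns, and the `insertAll` helper are hypothetical names chosen for illustration. It reuses one PreparedStatement, queues each row with `addBatch()`, and flushes with `executeBatch()` every `batchSize` rows. Transaction handling is left to the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class LoopBatchInsert {

    // Hypothetical table and columns, used only for illustration.
    static final String INSERT_SQL =
            "INSERT INTO t_user (username, address) VALUES (?, ?)";

    /**
     * Queues every row on a single PreparedStatement and flushes the batch
     * every batchSize rows. Returns the number of rows queued.
     */
    static int insertAll(Connection conn, List<String[]> rows, int batchSize)
            throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int queued = 0;
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String[] row : rows) {
                ps.setString(1, row[0]); // username
                ps.setString(2, row[1]); // address
                ps.addBatch();           // queue the row; nothing is sent yet
                queued++;
                if (queued % batchSize == 0) {
                    ps.executeBatch();   // send the queued rows to the server
                }
            }
            if (queued % batchSize != 0) {
                ps.executeBatch();       // flush the final partial batch
            }
        }
        return queued;
    }
}
```

Flushing in fixed-size batches bounds how many rows are held on the client at once; a suitable `batchSize` depends on row size and would need to be measured.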

The second approach builds one long SQL INSERT statement, which reduces network I/O because it needs only a single network operation. However, it has several downsides: the SQL statement can become too long, so the data has to be split across several statements; the PreparedStatement's pre-compilation advantage is lost because the SQL must be parsed again; and the database server also needs time to parse such a long statement.
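The second approach, including the splitting step, can be sketched as follows. This is an illustrative assumption rather than the article's implementation: the table `t_user`, its two columns, and the helpers `buildSql` and `partition` are hypothetical. Each chunk becomes one multi-row `INSERT` with `?` placeholders, so values are still bound as parameters rather than concatenated into the SQL text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MultiValueInsert {

    /**
     * Builds a multi-row INSERT with one "(?, ?)" tuple per row, for example
     * "INSERT INTO t_user (username, address) VALUES (?, ?), (?, ?)".
     * The table and column names are hypothetical.
     */
    static String buildSql(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be >= 1");
        }
        return "INSERT INTO t_user (username, address) VALUES "
                + String.join(", ", Collections.nCopies(rowCount, "(?, ?)"));
    }

    /** Splits rows into consecutive chunks of at most chunkSize elements. */
    static <T> List<List<T>> partition(List<T> rows, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += chunkSize) {
            chunks.add(rows.subList(i, Math.min(i + chunkSize, rows.size())));
        }
        return chunks;
    }
}
```

A caller would loop over `partition(rows, chunkSize)`, prepare `buildSql(chunk.size())` for each chunk, bind two parameters per row, and execute it. The statement text changes whenever the chunk size changes, which is why the final, shorter chunk needs its own statement.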

The core question is which cost dominates: the extra network I/O of the loop-based approach, or the extra SQL parsing and execution cost of the single long statement.

2. Data Testing

A test is conducted by inserting 50,000 records into a simple test table using a Spring Boot project with MyBatis and MySQL driver dependencies. A key parameter in the database connection URL, rewriteBatchedStatements, is set to true to enable batch execution of SQL by the MySQL JDBC driver.
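As a small illustration of that URL parameter, the sketch below assembles a MySQL JDBC URL with `rewriteBatchedStatements=true`. The host, port, and database name are placeholders rather than values from the article, and `batchUrl` is a hypothetical helper.

```java
class BatchJdbcUrl {

    /** Builds a MySQL JDBC URL with batch rewriting enabled. */
    static String batchUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?rewriteBatchedStatements=true";
    }

    public static void main(String[] args) {
        // Placeholder connection details, for illustration only.
        System.out.println(batchUrl("localhost", 3306, "batch_demo"));
    }
}
```

In a Spring Boot project the same string would typically be set as the `spring.datasource.url` property. If the URL already carries other parameters, this one is appended with `&` instead of `?`.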

2.1 First Approach Test

The first approach is tested with a single SqlSession opened in BATCH mode: records are inserted one by one, and the elapsed time for the whole insertion is logged. Mapper and service classes are created for the test.

2.2 Second Approach Test

For the second approach, the test involves inserting records using a single SQL statement with multiple values. The test compares the time taken for this method against the first approach.

In conclusion, the article reiterates the importance of considering the balance between network I/O and SQL execution efficiency when deciding on the batch insertion method. The author encourages further discussion and suggestions on this topic.
