springboot如何导入excel百万数据呢?
下文笔者讲述SpringBoot导入excel中百万数据的方法及示例分享,如下所示
导入百万数据的挑战
我们都知道任何程序,导入百万数据,都不是一件容易的事情
需我们着重关注性能和内存的使用情况,下文笔者将讲述具体的导入方法
导入百万数据的步骤
使用Apache POI的`SXSSFWorkbook`可有效处理大数据量的Excel文件
因为它基于流式处理
不会将整个文件加载到内存中
以下是一个详细的步骤
1.添加依赖:
在`pom.xml`中添加Apache POI依赖
2.创建控制器:
处理文件上传请求
3.创建服务:
处理Excel文件读取和数据处理
4.使用SXSSFWorkbook:
高效读取大数据量的Excel文件
5.批量插入数据库:
使用JPA或MyBatis等ORM框架批量插入数据
以提高性能
=====================================================================
添加依赖
在`pom.xml`中添加以下依赖:
<dependencies>
<!-- Spring Boot Starter Web -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- Spring Boot Starter Data JPA -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<!-- H2 Database (for testing) -->
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<scope>Runtime</scope>
</dependency>
<!-- Apache POI for Excel processing -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>5.2.3</version>
</dependency>
<!-- Apache POI for SXSSF (Streaming Usermodel API) -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml-schemas</artifactId>
<version>4.1.2</version>
</dependency>
<!-- XMLBeans for POI -->
<dependency>
<groupId>org.apache.xmlbeans</groupId>
<artifactId>xmlbeans</artifactId>
<version>5.1.1</version>
</dependency>
<!-- Commons Compress for POI -->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-compress</artifactId>
<version>1.21</version>
</dependency>
</dependencies>
实体类创建
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
@Entity
public class UserData {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
private String name;
private String email;
private String phone;
// Getters and Setters
public Long getId() {
return id;
}
public void setId(Long id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getEmail() {
return email;
}
public void setEmail(String email) {
this.email = email;
}
public String getPhone() {
return phone;
}
public void setPhone(String phone) {
this.phone = phone;
}
}
创建Repository接口
创建JPA Repository接口
处理数据库操作
import org.springframework.data.jpa.repository.JpaRepository;
public interface UserDataRepository extends JpaRepository<UserData, Long> {
// Custom query methods can be defined here if needed
}
创建Controller控制器
创建一个控制器来处理文件上传请求
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;
@RestController
public class ExcelUploadController {
@Autowired
private ExcelService excelService;
@PostMapping("/upload")
public String uploadFile(@RequestParam("file") MultipartFile file) {
if (file.isEmpty()) {
return "Please select a file to upload.";
}
try {
excelService.processExcelFile(file);
return "File uploaded and processed successfully.";
} catch (Exception e) {
e.printStackTrace();
return "Failed to process the file.";
}
}
}
创建服务类处理程序
创建一个服务类 来处理Excel文件的读取和数据处理 并批量插入数据库 import org.apache.poi.ss.usermodel.*; import org.apache.poi.xssf.streaming.SXSSFWorkbook; import org.apache.poi.xssf.usermodel.XSSFWorkbook; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.stereotype.Service; import org.springframework.web.multipart.MultipartFile; import javax.transaction.Transactional; import java.io.InputStream; import java.util.Arraylist; import java.util.Iterator; import java.util.List; @Service public class ExcelService { @Autowired private UserDataRepository userDataRepository; @Transactional public void processExcelFile(MultipartFile file) throws Exception { try (InputStream inputStream = file.getInputStream(); Workbook workbook = new SXSSFWorkbook(new XSSFWorkbook(inputStream), 100)) { //内存中只保留100行数据 Sheet sheet = workbook.getSheetAt(0); Iterator<Row> rowIterator = sheet.iterator(); // Skip header row if present if (rowIterator.hasNext()) { rowIterator.next(); } List<UserData> userDataList = new ArrayList<>(); int batchSize = 1000; // Define batch size int count = 0; while (rowIterator.hasNext()) { Row row = rowIterator.next(); Iterator<Cell> cellIterator = row.cellIterator(); UserData userData = new UserData(); int cellIndex = 0; while (cellIterator.hasNext()) { Cell cell = cellIterator.next(); switch (cellIndex) { case 0: userData.setName(cell.getStringCellValue()); break; case 1: userData.setEmail(cell.getStringCellValue()); break; case 2: userData.setPhone(cell.getStringCellValue()); break; default: break; } cellIndex++; } userDataList.add(userData); count++; // Save batch if (count % batchSize == 0) { userDataRepository.saveAll(userDataList); userDataList.clear(); } } // Save remaining data if (!userDataList.isEmpty()) { userDataRepository.saveAll(userDataList); } // Flush and dispose of temporary files ((SXSSFWorkbook) workbook).dispose(); } } }
采用以上程序可完美的插入百万数据
可避免内存溢出
批量插入优化
- 批量大小:
通过设置`batchSize`来控制每次批量插入的数据量
可根据实际情况调整
- 事务管理:
使用`@Transactional`注解确保批量
插入操作在一个事务中进行
提高数据一致性和性能
版权声明
本文仅代表作者观点,不代表本站立场。
本文系作者授权发表,未经许可,不得转载。


