How to Read Excel Files and Insert Data into a Database using Java (Apache POI & JDBC)
Importing Excel Data into a Database using Java
Importing data from spreadsheets into a relational database is a common requirement in enterprise applications. Whether you are migrating legacy data or building an import feature for employee records, doing it efficiently in Java requires a combination of Apache POI (for parsing Excel) and JDBC Batch Processing (for fast database inserts).
In this guide, we will build on your basic Apache POI setup, address common pitfalls (like handling header rows and cell types), and implement a high-performance batch insert mechanism to store 1,000+ records in seconds.
1. Adding the Required Dependencies
To read modern Excel formats (.xlsx), you need the poi-ooxml library. Ensure your pom.xml has the following dependency:
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>5.4.1</version>
</dependency>Note: You will also need the JDBC driver for your specific database (e.g., MySQL, PostgreSQL, or Oracle).
2. The Challenge: Why Simple Loops Fail in Production
While your initial code successfully reads cell values, executing a database INSERT query inside that for loop for every row is a performance anti-pattern. If you have 1,000 records, your application will make 1,000 round-trips to the database, resulting in slow execution times.
Instead, we should use JDBC Batch Processing to send all 1,000 records to the database in a single network trip.
3. Implementing the Complete Solution
Here is a robust, production-ready Java class that safely reads the Excel file, skips the header row, handles potential null cells, and performs a batch insert into a database.
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.FileInputStream;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
public class ExcelToDatabaseImporter {
private static final String DB_URL = "jdbc:mysql://localhost:3306/your_database";
private static final String DB_USER = "root";
private static final String DB_PASSWORD = "password";
public static void main(String[] args) {
String excelFilePath = "employees.xlsx";
String insertSQL = "INSERT INTO employees (emp_id, name, email, salary) VALUES (?, ?, ?, ?)";
// Try-with-resources automatically closes connections, statements, and file streams
try (FileInputStream fis = new FileInputStream(excelFilePath);
Workbook workbook = new XSSFWorkbook(fis);
Connection connection = DriverManager.getConnection(DB_URL, DB_USER, DB_PASSWORD);
PreparedStatement preparedStatement = connection.prepareStatement(insertSQL)) {
// Disable auto-commit to manage transactions manually for batching
connection.setAutoCommit(false);
Sheet sheet = workbook.getSheetAt(0);
int batchSize = 100;
int count = 0;
for (Row row : sheet) {
// Skip the header row (Row 0)
if (row.getRowNum() == 0) {
continue;
}
try {
// Extract data safely
int empId = (int) row.getCell(0).getNumericCellValue();
String name = row.getCell(1).getStringCellValue();
String email = row.getCell(2).getStringCellValue();
double salary = row.getCell(3).getNumericCellValue();
// Set parameters for PreparedStatement
preparedStatement.setInt(1, empId);
preparedStatement.setString(2, name);
preparedStatement.setString(3, email);
preparedStatement.setDouble(4, salary);
// Add to batch
preparedStatement.addBatch();
count++;
// Execute batch every 100 records to avoid memory issues
if (count % batchSize == 0) {
preparedStatement.executeBatch();
}
} catch (NullPointerException | IllegalStateException e) {
System.err.println("Skipping invalid row at index: " + row.getRowNum() + " due to error: " + e.getMessage());
}
}
// Execute remaining records in the batch
preparedStatement.executeBatch();
connection.commit(); // Commit transaction
System.out.println("Successfully imported " + count + " employee records.");
} catch (IOException | SQLException e) {
e.printStackTrace();
}
}
}4. Key Improvements & Best Practices
- Try-With-Resources: Both Apache POI workbooks and JDBC connections hold system resources. Using
try-with-resourcesensures they are closed automatically, avoiding memory leaks. - Skipping Headers: We use
row.getRowNum() == 0to skip the first row, which typically contains column titles like "Employee ID", "Name", etc. - Batch Processing: Setting
connection.setAutoCommit(false)and executingpreparedStatement.executeBatch()drastically improves performance, cutting execution time from minutes to milliseconds. - Error Handling: Wrapping the cell extraction in a
try-catchblock prevents a single malformed row (like a missing email or text in a numeric column) from crashing the entire import process.
5. Going Further: Handling Massive Files
If your Excel file scales from 1,000 records to 100,000+ records, loading the entire file into memory using XSSFWorkbook can cause an OutOfMemoryError. For extremely large files, consider using Apache POI's event-driven SXSSF (Streaming Usermodel API) or libraries like fastexcel or EasyExcel designed for low-memory footprints.