Improve performance when polling replies in MessageChannelPartitionHandler #4135

Closed
lcmarvin opened this issue Jun 17, 2022 · 1 comment

@lcmarvin
Contributor

Currently, MessageChannelPartitionHandler polls the partition StepExecutions with a for loop over an iterator:

for (Iterator<StepExecution> stepExecutionIterator = split.iterator(); stepExecutionIterator.hasNext();) {
	StepExecution curStepExecution = stepExecutionIterator.next();

	if (!result.contains(curStepExecution)) {
		StepExecution partitionStepExecution = jobExplorer
				.getStepExecution(managerStepExecution.getJobExecutionId(), curStepExecution.getId());

		if (!partitionStepExecution.getStatus().isRunning()) {
			result.add(partitionStepExecution);
		}
	}
}

When there are many partition StepExecutions, each call to org.springframework.batch.core.explore.JobExplorer#getStepExecution issues many repeated SQL queries against the database. Here is the implementation of this method:

	@Nullable
	@Override
	public StepExecution getStepExecution(@Nullable Long jobExecutionId, @Nullable Long executionId) {
		JobExecution jobExecution = jobExecutionDao.getJobExecution(jobExecutionId);
		if (jobExecution == null) {
			return null;
		}
		getJobExecutionDependencies(jobExecution);
		StepExecution stepExecution = stepExecutionDao.getStepExecution(jobExecution, executionId);
		getStepExecutionDependencies(stepExecution);
		return stepExecution;
	}

The JobInstance, JobExecution, JobParameters, JobExecutionContext, and StepExecutions of the JobExecution are queried again on every call, which in my opinion is unnecessary.

So can we query only the needed StepExecution in this loop? I believe simplifying the query would improve performance.
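
To illustrate the difference, here is a minimal, self-contained sketch in plain Java (no Spring Batch dependency). Everything in it is hypothetical: `FakeStore`, `isRunningViaFullLoad`, and `isRunningViaDirectLookup` are stand-ins I made up, and the per-call query counts are illustrative assumptions, not measurements of the real DAOs. It only shows how the number of queries per polling pass grows with the number of partitions under each approach.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PollingSketch {

    /** Hypothetical stand-in for the job repository; counts simulated SQL queries. */
    static class FakeStore {
        final Map<Long, Boolean> running = new HashMap<>(); // step id -> still running?
        int queries = 0;

        FakeStore(int partitions) {
            for (long id = 1; id <= partitions; id++) {
                running.put(id, false); // assume every partition has already finished
            }
        }

        /** Mimics the heavy lookup: reload the job execution and its dependencies, then the step. */
        boolean isRunningViaFullLoad(long stepId) {
            // Assumed cost: JobExecution, JobInstance, JobParameters,
            // JobExecutionContext, and all StepExecutions of the job.
            queries += 5;
            queries += 1; // the requested StepExecution itself
            return running.get(stepId);
        }

        /** Mimics a narrow lookup that reads only the requested step execution. */
        boolean isRunningViaDirectLookup(long stepId) {
            queries += 1;
            return running.get(stepId);
        }
    }

    /** Runs one polling pass over all partitions and returns the simulated query count. */
    static int queriesForOnePoll(int partitions, boolean direct) {
        FakeStore store = new FakeStore(partitions);
        Set<Long> finished = new HashSet<>();
        for (long id = 1; id <= partitions; id++) {
            if (finished.contains(id)) {
                continue; // already collected, same idea as result.contains(...)
            }
            boolean stillRunning = direct
                    ? store.isRunningViaDirectLookup(id)
                    : store.isRunningViaFullLoad(id);
            if (!stillRunning) {
                finished.add(id);
            }
        }
        return store.queries;
    }

    public static void main(String[] args) {
        System.out.println("full load per step: " + queriesForOnePoll(100, false));
        System.out.println("direct lookup:      " + queriesForOnePoll(100, true));
    }
}
```

Under these assumed costs, one pass over 100 partitions simulates 600 queries with the full reload versus 100 with the direct lookup. The real ratio depends on what the DAOs actually execute, and the gap compounds while partitions are still running because they are polled again on every pass.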

@fmbenhassine
Contributor

This is a valid point, thank you for raising it!

The performance improvement will be addressed in #3790, which reports the same issue.

@fmbenhassine fmbenhassine added the status: duplicate Issues that are duplicates of other issues label Feb 22, 2023
@fmbenhassine fmbenhassine closed this as not planned Feb 22, 2023