Skip to content

Add configurable TTL for every Lock #3095

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
jkubrynski opened this issue Jul 25, 2019 · 12 comments
Open

Add configurable TTL for every Lock #3095

jkubrynski opened this issue Jul 25, 2019 · 12 comments

Comments

@jkubrynski
Copy link

We're using the LockRegistryLeaderInitiator mechanism to control leader election in the cluster backed with Hazelcast. I see that it uses tryLock(long time, TimeUnit unit) method without specifying the maximum lease time. In fact, it's possible that without graceful shutdown the instance won't release the lock and then no other leader will be elected. All instances then are logging
Acquiring the lock for LockContext{role=my-app-leader, id={{app-uuid}}, isLeader=false}

Without manually forcing unlock the whole cluster is stuck. I know that the java.util.concurrent.locks.Lock doesn't offer locking for a given time, but it's supported for example with the com.hazelcast.core.ILock.tryLock(long time, TimeUnit unit, long leaseTime, TimeUnit leaseUnit) method.

Maybe simple possibility to override LeaderSelector.call lock acquisition part could be extracted so users can override it with and provide max lease time for example? It could be also done for example in the spring-integration-hazelcast module.

@garyrussell
Copy link
Contributor

One workaround would be to provide a custom implementation of HazelcastLockRegistry that returns a LockWrapper (implements Lock - there are only a handful of methods).

Have all the methods delegate to the underlying Lock except tryLock(time, unit) which can call the extended tryLock() method.

I think I would prefer that approach over changing the leader initiator.

@garyrussell
Copy link
Contributor

I wouldn't object to making such a change directly in the HazelcastLockRegistry itself (optionally). Feel free to make a contribution.

@jkubrynski
Copy link
Author

I missed the option with the custom LockRegistry and it's definitely the simplest one. I will discuss the timings with the team and will try to submit a PR.

@jkubrynski
Copy link
Author

I've just looked at the code, and the problem is that LockRegistry returns the java.util.concurrent.locks.Lock object. So to make this walkaround useful we'd need to proxy the ILock object with tryLock method delegated to the one with leaseTime set to some default. It will work but won't be a clean solution. And the only clean solution I see there is to extract custom Lock interface or extend LockRegistry with a tryLock method. WDYT?

@garyrussell
Copy link
Contributor

garyrussell commented Jul 25, 2019

Right - that's what I meant by LockWrapper. Something like this...

public class HazelcastLockRegistry implements LockRegistry {

	private final HazelcastInstance client;

	private final long leaseTime;

	private final TimeUnit leaseTimeUnit;

	public HazelcastLockRegistry(HazelcastInstance hazelcastInstance) {
		this(hazelcastInstance, 0, null);
	}

	public HazelcastLockRegistry(HazelcastInstance hazelcastInstance, long leaseTime, TimeUnit leaseTimeUnit) {
		Assert.notNull(hazelcastInstance, "'hazelcastInstance' must not be null");
		this.client = hazelcastInstance;
		this.leaseTime = leaseTime;
		this.leaseTimeUnit = leaseTimeUnit;
	}

	@Override
	public Lock obtain(Object lockKey) {
		Assert.isInstanceOf(String.class, lockKey);
		ILock lock = this.client.getLock((String) lockKey);
		if (this.leaseTimeUnit == null) {
			return lock;
		}
		else {
			return new LockWrapper(lock, this.leaseTime, this.leaseTimeUnit);
		}
	}

	private static final class LockWrapper implements Lock {

		private final ILock lock;

		private final long leaseTime;

		private final TimeUnit leaseTimeUnit;

		LockWrapper(ILock lock, long leaseTime, TimeUnit leaseTimeUnit) {
			this.lock = lock;
			this.leaseTime = leaseTime;
			this.leaseTimeUnit = leaseTimeUnit;
		}

		@Override
		public void lock() {
			this.lock.lock();
		}

		@Override
		public void lockInterruptibly() throws InterruptedException {
			this.lock.lockInterruptibly();
		}

		@Override
		public boolean tryLock() {
			return this.lock.tryLock();
		}

		@Override
		public boolean tryLock(long time, TimeUnit unit) throws InterruptedException {
			return this.lock.tryLock(time, unit, this.leaseTime, this.leaseTimeUnit);
		}

		@Override
		public void unlock() {
			this.lock.unlock();
		}

		@Override
		public Condition newCondition() {
			return this.lock.newCondition();
		}

	}

}

@artembilan
Copy link
Member

I think it's time to change a LockRegistry.obtain() contract with such a Duration leaseTime. Most of the implementations for distributed locks really provide that functionality to avoid the described situation with dead lock.

I agree that it may require some internal changes for every single implementation, but we need to align with protocol requirements somehow anyway...

Yes, we can start with Hazelcast, but that is already a matter of the Spring Integration Extensions project: https://github.com/spring-projects/spring-integration-extensions/tree/master/spring-integration-hazelcast

@garyrussell
Copy link
Contributor

garyrussell commented Jul 30, 2019

obtain() only gets a lock from the registry (based on the key); it doesn't actually acquire the lock.

We'd need an integration lock extension like hazelcast's ILock - although I see that is deprecated now (in 3.12) in favor of FencedLock.

@jkubrynski
Copy link
Author

From my perspective LockRegistry.obtain() has a clear responsibility of acquiring the lock. How and for how long can be the responsibility of the underlying infrastructure. I've implemented the solution proposed by @garyrussell and it works like a charm on production.

@artembilan
Copy link
Member

OK. If you both agree that common private final long leaseTime; value as a configuration option is enough, we can go ahead into that direction everywhere needed.

@artembilan
Copy link
Member

What I see in the RedisLockRegistry is called expireAfter and used in the Redis script like: redis.call('PEXPIRE', KEYS[1], ARGV[2]).
The JDBC solution provides for us a ttl option on the DefaultLockRepository. Every call to the acquire() initiates a deleteExpired().
The Gemfire one is based on the Cache.getLockLease().
At the same time doesn't look like Zookeeper + Curator Framework provide some option for leasing...

So, I guess we are good in this project for all the implementation.
Since the original concern is about a Hazelcast implementation, moving this issue into an appropriate project.

Will continue discussion over there.

Thanks for understanding!

@artembilan artembilan transferred this issue from spring-projects/spring-integration Aug 21, 2019
@artembilan
Copy link
Member

Hm... I don't see any big problem with casting a LockRegistry.obtain() result into the ILock and perform many useful Hazelcast features on it, including desired tryLock(long time, TimeUnit unit, long leaseTime, TimeUnit leaseUnit) and lock(long leaseTime, TimeUnit timeUnit);. With the LockWrapper we need to keep ILock contract for possible casting in the end-user. But if that is the fact, we don't need any additional options, since, essentially, we just going to break ILock contract implementation:

@Override
public boolean tryLock(long timeout, TimeUnit unit) throws InterruptedException {
       return tryLock(timeout, unit, Long.MAX_VALUE, null);
}

So, knowing that you deal with Hazelcast, just be sure to cast into ILock and go ahead with leaseTime requirements!

I treat this as Works as Designed and will keep it open only until your feedback.

Thank you for understanding!

@artembilan artembilan transferred this issue from spring-projects/spring-integration-extensions Oct 30, 2019
@artembilan artembilan changed the title Leader election lock is acquired for 9223370472786555805 ms Add configurable TTL for every Lock Oct 30, 2019
@artembilan
Copy link
Member

Moved back to Spring integration since it looks like we need to provide a DistributedLock contract in other LockRegistry implementations.

Move info from the https://jira.spring.io/browse/INT-4399

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants