Tuesday, 20 May 2025

Timestamp-Based Protocols

Timestamp-Based Protocols

  • Each transaction Tis issued a timestamp TS(Ti) when it enters the system.

       Each transaction has a unique timestamp

       Newer transactions have timestamps strictly greater than earlier ones

       Timestamp could be based on a logical counter

§  Real time may not be unique

§  Can use (wall-clock time, logical counter) to ensure

  • Timestamp-based protocols manage concurrent execution such that
          time-stamp order = serializability order
  • Several alternative protocols based on timestamps

The timestamp ordering (TSO) protocol

  • Maintains for each data Q two timestamp values:

       W-timestamp(Q) is the largest time-stamp of any transaction that executed write(Q) successfully.

       R-timestamp(Q) is the largest time-stamp of any transaction that executed read(Q) successfully.

  • Imposes rules on read and write operations to ensure that

       Any conflicting  operations are executed in timestamp order

Out of order operations cause transaction rollback

Suppose a transaction Ti issues a read(Q)

1.   If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q  that

      was already overwritten.

  • Hence, the read operation is rejected, and Ti  is rolled back.

2.   If TS(Ti) ³  W-timestamp(Q), then the read operation is executed, and

      R-timestamp(Q) is set to

                 max(R-timestamp(Q), TS(Ti)).

Suppose that transaction Ti issues write(Q).

1.   If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing

      was needed previously, and the system assumed that that value

      would never be produced.

Ø  Hence, the write operation is rejected, and Ti is rolled back.

2.   If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an

      obsolete value of Q.

Ø  Hence, this write operation is rejected, and Ti is rolled back.

3.   Otherwise, the  write operation is executed, and W-timestamp(Q) is

      set to TS(Ti).

 

n        Recoverable schedule — if a transaction Tj reads a data items previously written by a transaction Ti , the commit operation of Ti  appears before the commit operation of Tj.

The following schedule (Schedule 11) is not recoverable if T9 commits immediately after the read



If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state.  Hence database must ensure that schedules are recoverable

Cascading rollback – a single transaction failure leads to a series of transaction rollbacks.  Consider the following schedule where none of the transactions has yet committed (so the schedule is recoverable)






n        If T10 fails, T11 and T12 must also be rolled back.

n        Can lead to the undoing of a significant amount of work

n        Cascadeless schedules — cascading rollbacks cannot occur; for each pair of transactions Ti and Tj such that Tj  reads a data item previously written by Ti, the commit operation of Ti  appears before the read operation of Tj.

n        Every cascadeless schedule is also recoverable

n        It is desirable to restrict the schedules to those that are cascadeless

System is deadlocked if there is a set of transactions such that every transaction in the set is waiting for another transaction in the set.











 

Sunday, 2 July 2023

Time Stamp Ordering Protocol

Time Stamp Ordering Protocol:


A timestamp is a tag that can be attached to any transaction or any data item, which denotes a specific time on which the transaction or the data item had been used in any way. A timestamp can be implemented in 2 ways. One is to directly assign the current value of the clock to the transaction or data item. The other is to attach the value of a logical counter that keeps increment as new timestamps are required.

The timestamp of a data item can be of 2 types:


(i) W-timestamp(X): 

This means the latest time when the data item X has been written into.


(ii) R-timestamp(X):


This means the latest time when the data item X has been read from. 

These 2 timestamps are updated each time a successful read/write operation is performed on the data item X. 

Every transaction is issued a timestamp based on when it enters the system. Suppose, if an old transaction Ti has timestamp TS(Ti), a new transaction Tj is assigned timestamp TS(Tj) such that TS(Ti) < TS(Tj).The protocol manages concurrent execution such that the timestamps determine the serializability order. 

The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order. Whenever some Transaction T tries to issue a R_item(X) or a W_item(X), the Basic TO algorithm compares the timestamp of T with R_TS(X) & W_TS(X) to ensure that the Timestamp order is not violated. This describes the Basic TO protocol in following two cases.


Whenever a Transaction T issues a W_item(X) operation, check the following conditions:


• If R_TS(X) > TS(T) or if W_TS(X) > TS(T), then abort and rollback T and reject the operation. else, Execute W_item(X) operation of T and set W_TS(X) to TS(T).


Whenever a Transaction T issues a R_item(X) operation, check the following conditions:


•If W_TS(X) > TS(T), then abort and reject T and reject the operation, else If W_TS(X) <= TS(T), then execute the R_item(X) operation of T and set R_TS(X) to the larger of TS(T) and current R_TS(X).


Whenever the Basic TO algorithm detects two conflicting operation that occur in incorrect order, it rejects the later of the two operation by aborting the Transaction that issued it. Schedules produced by Basic TO are guaranteed to be conflict serializable. Already discussed that using Timestamp, can ensure that our schedule will be deadlock free. One drawback of Basic TO protocol is that it Cascading Rollback is still possible. Suppose we have a Transaction T1 and T2 has used a value writtenby T1. If T1 is aborted and resubmitted to the system then, T must also be aborted and rolled back. So the problem of Cascading aborts still prevails. 

 

Validation Based Protocol

Validation phase is also known as optimistic concurrency control technique. In the validation based protocol, the transaction is executed in the following three phases:


1. Read phase: In this phase, the transaction T is read and executed. It is used to read the value of various data items and stores them in temporary local variables. It can perform all the write operations on temporary variables without an update to the actual database.


2. Validation phase: In this phase, the temporary variable value will be validated against the actual data to see if it violates the serializability.


3. Write phase: If the validation of the transaction is validated, then the temporary results are written to the database or system otherwise the transaction is rolled back.


Here each phase has the following different timestamps:


Start(Ti): It contains the time when Ti started its execution.


Validation (Ti): It contains the time when Ti finishes its read phase and starts its validation phase. 

Finish(Ti): It contains the time when Ti finishes its write phase.


o This protocol is used to determine the time stamp for the transaction for serialization using the time stamp of the validation phase, as it is the actual phase which determines if the transaction will commit or rollback.


o Hence TS(T) = validation(T).


o The serializability is determined during the validation process. It can't be decided in advance.


o While executing the transaction, it ensures a greater degree of concurrency and also less number of conflicts.


o Thus it contains transactions which have less number of rollbacks 

 

Multiple Granularity

 
Let's start by understanding the meaning of granularity.


Granularity: It is the size of data item allowed to lock.


Multiple Granularity:
o It can be defined as hierarchically breaking up the database into blocks which can be
locked.
o The Multiple Granularity protocol enhances concurrency and reduces lock overhead.
o It maintains the track of what to lock and how to lock.
o It makes easy to decide either to lock a data item or to unlock a data item. This type
of hierarchy can be graphically represented as a tree.


For example: Consider a tree which has four levels of nodes.
o The first level or higher level shows the entire database.
o The second level represents a node of type area. The higher level database consists of
exactly these areas.
o The area consists of children nodes which are known as files. No file can be present in 

more than one area.
o Finally, each file contains child nodes known as records. The file has exactly those
records that are its child nodes. No records represent in more than one file.
o Hence, the levels of the tree starting from the top level are as follows:
1. Database
2. Area
3. File
4. Record

Concurrency control

Concurrency control is provided in a database to:


(i) Enforce isolation among transactions.
(ii) Preserve database consistency through consistency preserving execution of transactions.
(iii) Resolve read-write and write-read conflicts.


Various concurrency control techniques are:


(1) Two Phase Locking (2PL) Protocol
(2) Time Stamp Ordering Protocol 

Lock-Based Protocol

Locking is an operation which secures: permission to read, OR permission to write a data item. Two phase locking is a process used to gain ownership of shared resources without creating the possibility of deadlock.


The 3 activities taking place in the two phase update algorithm are:
(i) Lock Acquisition
(ii) Modification of Data
(iii) Release Lock

 

In this type of protocol, any transaction cannot read or write data until it acquires an appropriate lock on it. There are two types of lock:

1. Shared lock:

  • It is also known as a Read-only lock. In a shared lock, the data item can only read by the transaction.
  • It can be shared between the transactions because when the transaction holds a lock, then it can't update the data on the data item.

2. Exclusive lock:

  • In the exclusive lock, the data item can be both reads as well as written by the transaction.
  • This lock is exclusive, and in this lock, multiple transactions do not modify the same data simultaneously.

 

1. Two-Phase Locking Protocol:

  • The two-phase locking protocol divides the execution phase of the transaction into three parts.
  • In the first part, when the execution of the transaction starts, it seeks permission for the lock it requires.
  • In the second part, the transaction acquires all the locks. The third phase is started as soon as the transaction releases its first lock.
  • In the third phase, the transaction cannot demand any new locks. It only releases the acquired locks.

 

 Two phase locking prevents deadlock from occurring in distributed systems by releasing all the resources it has acquired, if it is not possible to acquire all the resources required without waiting for another process to finish using a lock.


This means that no process is ever in a state where it is holding some shared resources, and waiting for another process to release a shared resource which it requires. This means that deadlock cannot occur due to resource contention. 

A transaction is said to follow Two Phase Locking protocol if Locking and Unlocking can be done in two phases.

 

DBMS Lock-Based Protocol


I. Growing Phase: 

In this phase a transaction can only acquire locks but cannot release any lock. The point when a transaction acquires all the locks it needs is called the Lock Point.


II. Shrinking Phase: 


In this phase a transaction can only release locks but cannot acquire any Note – If lock conversion is allowed, then upgrading of lock( from S(a) to X(a) ) is allowed in Growing Phase and downgrading of lock (from X(a) to S(a)) must be done in shrinking phase 

 

Let’s see a transaction implementing 2-PL. 

 


 T1T2
1lock-S(A) 
2 lock-S(A)
3lock-X(B) 
4…….……
5Unlock(A) 
6 Lock-X(C)
7Unlock(B) 
8 Unlock(A)
9 Unlock(C)
10…….……
 

This is just a skeleton transaction that shows how unlocking and locking work with 2-PL. Note for: 
Transaction T1:

  • The growing Phase is from steps 1-3.
  • The shrinking Phase is from steps 5-7.
  • Lock Point at 3

Transaction T2:

  • The growing Phase is from steps 2-6.
  • The shrinking Phase is from steps 8-9.
  • Lock Point at 6
 

2-PL ensures serializability, but there are still some drawbacks of 2-PL. Let’s glance at the drawbacks: 

 

Friday, 26 May 2023

Serialization

 

Serializability

When multiple transactions are running concurrently then there is a possibility that the database may be left in an inconsistent state. Serializability is a concept that helps us to check which schedules are serializable. A serializable schedule is the one that always leaves the database in consistent state.

What is a serializable schedule?

A serializable schedule always leaves the database in consistent state. A serial schedule is always a serializable schedule because in serial schedule, a transaction only starts when the other transaction finished execution. However a non-serial schedule needs to be checked for Serializability.

A non-serial schedule of n number of transactions is said to be serializable schedule, if it is equivalent to the serial schedule of those n transactions. A serial schedule doesn’t allow concurrency, only one transaction executes at a time and the other starts when the already running transaction finished.

Types of Serializability 

There are two types of Serializability.

 1.         Conflict Serializability

 2.         View Serializability

 Conflict Serializability

 A schedule is called conflict serializable if we can convert it into a serial schedule after swapping its non-conflicting operations.

 Conflicting operations

 Two operations are said to be in conflict, if they satisfy all the following three conditions:

 1. Both the operations should belong to different transactions.

 2. Both the operations are working on same data item.

 3. At least one of the operations is a write operation

 View Serializability

 View Serializability is a process to find out that a given schedule is view serializable or not.

 To check whether a given schedule is view serializable, we need to check whether the given schedule is View Equivalent to its serial schedule

 Testing of Serializability

 Serialization Graph is used to test the Serializability of a schedule.

 Assume a schedule S. For S, we construct a graph known as precedence graph. This graph has a pair G = (V, E), where V consists a set of vertices, and E consists a set of edges. The set of vertices is used to contain all the transactions participating in the schedule. The set of edges is used to contain all edges Ti ->Tj for which one of the three conditions holds:

 Create a node Ti → Tj if Ti executes write (Q) before Tj executes read (Q).

 Create a node Ti → Tj if Ti executes read (Q) before Tj executes write (Q).

Create a node Ti → Tj if Ti executes write (Q) before Tj executes write (Q).

If a precedence graph contains a single edge Ti → Tj, then all the instructions of Ti are executed before the first instruction of Tj is executed. If a precedence graph for schedule S contains a cycle, then S is non-serializable. If the precedence graph has no cycle, then S is known as serializable.

 

 

 


Read(A): In T1, no subsequent writes to A, so no new edges

Read(B): In T2, no subsequent writes to B, so no new edges

Read(C): In T3, no subsequent writes to C, so no new edges

Write(B): B is subsequently read by T3, so add edge T2 → T3

Write(C): C is subsequently read by T1, so add edge T3 → T1

 Write(A): A is subsequently read by T2, so add edge T1 → T2

 Write(A): In T2, no subsequent reads to A, so no new edges

 Write(C): In T1, no subsequent reads to C, so no new edges

 Write(B): In T3, no subsequent reads to B, so no new edges 


Precedence graph for schedule S1:

 

 


View Serializability

  • A schedule will view serializable if it is view equivalent to a serial schedule.
  • If a schedule is conflict serializable, then it will be view serializable.
  • The view serializable which does not conflict serializable contains blind writes.

View Equivalent

Two schedules S1 and S2 are said to be view equivalent if they satisfy the following conditions:

1. Initial Read

An initial read of both schedules must be the same. Suppose two schedule S1 and S2. In schedule S1, if a transaction T1 is reading the data item A, then in S2, transaction T1 should also read A.





Above two schedules are view equivalent because Initial read operation in S1 is done by T1 and in S2 it is also done by T1.

2. Updated Read

In schedule S1, if Ti is reading A which is updated by Tj then in S2 also, Ti should read A which is updated by Tj.

 




Above two schedules are not view equal because, in S1, T3 is reading A updated by T2 and in S2, T3 is reading A updated by T1.

3. Final Write

A final write must be the same between both the schedules. In schedule S1, if a transaction T1 updates A at last then in S2, final writes operations should also be done by T1.





Above two schedules is view equal because Final write operation in S1 is done by T3 and in S2, the final write operation is also done by T3


Example:


Schedule S

With 3 transactions, the total number of possible schedule

1.     3! = 6  

2.     S1 = <T1 T2 T3>  

3.     S2 = <T1 T3 T2>  

4.     S3 = <T2 T3 T1>  

5.     S4 = <T2 T1 T3>  

6.     S5 = <T3 T1 T2>  

7.     S6 = <T3 T2 T1>  

Taking first schedule S1:

ex


Timestamp-Based Protocols

Timestamp-Based Protocols Each transaction T i  is issued a timestamp TS( T i ) when it enters the system. •        Each transac...