Comparison of Software and Hardware Transactional Memory
Transactional memory is poised to take parallel programming a step higher by making it more efficient and much easier to achieve, compared to traditional ‘semaphore’ locking. This is because transactional memory is easier to handle when tasks are divided into several free threads, especially when these threads do not have common access to data. This implies that each section can operate on a processor core and that there is no connection between cores. It can be challenging when different task sections are not totally free – that is, several threads are forced to upgrade a singly shared value. The traditional approach to this utilizes locks and every time that a thread changes a shared value, it requires a lock. In this case, it is not possible for any other thread to receive the lock if another thread possesses it. Instead, the thread must wait until the thread that has the lock can change the shared value. This is likely to require a complex computation, and to take an extended amount of time before eventually releasing the lock (Bright, 2011). The release of the lock allows the waiting thread to continue. While this is an effective system, it faces several major challenges. A key issues concerns updates to the shared value that occurs occasionally; therefore, making it rare for a thread to wait at ‘no time’ – the state in which the lock based system can be efficient (Alexandrescu, 2004). Nonetheless, this efficiency commonly disappears every time updates to the shared value are made. Threads take too much time waiting for a lock to appear and are unable to provide any use when in this state.
2.Lock vs. Lock Free Data Structures
While it may seem easy to handle a singly shared value, locks are difficult to use correctly and this is a challenge faced in real programs. For instance, a program with dual locks 1 and 2 is likely to encounter a problem called a ‘deadlock’. A deadlock is a case whereby two threads require two locks and only have the option of acquiring lock 1 then 2, or 2 followed by 1. As long as each thread needs the lock in the same order this will not present any issues; however, if one thread needs lock 1 and the other requires lock 2 at the same time, this can cause a deadlock. In this situation, the first thread waits for lock 2 to become free and the second waits for 1 to be free. This exchange makes it difficult for both to succeed and results in a deadlock. This issue might appear to be preventable and only likely to occur when a program has dual locks; however, it can become a challenge to ensure each section performs the right function when this becomes more complex.
3. Transactional Memory
It can argued that transactional memory can solve the problem of lock conflicts. In this case of a deadlock, the programmers could mark the sections of their programs which change the shared data, so that each of the marked blocks is implemented within a transaction. This means that either the whole block executes, or none of it does. The program can therefore identify the shared value without locking it. This allows for the program to conduct all the necessary operations and write back the value, eventually committing the transaction (Bright, 2011). The key transaction occurs with the commit operation in which transactional memory system ascertains that shared data has been changed after the commencement of an operation. If this is not identified then the commit updates, allowing the thread to go ahead with its function. In case the shared value has not been modified, the transaction stops and the function of the thread is rolled back (Detlefs et al., 2001). In this instance, the program retries the operation.
It can be seen, therefore, transactional memory has several merits over traditional semaphore locking. For example, transactional memory is optimistic; this infers that the threads are positioned to succeed and do not look forward to acquiring a lock. This is in case the other thread makes an attempt to conduct a concurrent operation (Detlefs et al., 2001). In an instance of concurrent modifications occurs when a single thread is forced to retry its function. In addition to this, there are no deadlocks in transactional memory. Transactional memory is a programming approach that programmers are familiar with; the transaction and rollback process is not new to those who have handled relational databases because they offer a similar set of features. Nonetheless, blocks facilitate the ease of developing large programs that are correct (Alexandrescu, 2004). Blocks with nested atomic blocks will perform the correct function, although this is not true in the case of lock-based data structures.
4.Merits of the Hardware
There has been little attention given towards hardware compared to software-based implementations. It has also been noted that most real processors seldom support transactional memory and, therefore, modifications are necessary (Maged, 2004). However, there are systems that use virtual machines to undertake their primary function and in this regard there are changes for the .NET and Java virtual machines (Bright, 2011). In other cases, systems use native codes that require certain special operations to allow access to the shared data. This enables the transactional memory software to ascertain that the right operations have occurred in the background. Such implementations have the advantages of ensuring that the programs that are produced are bug-free (Detlefs et al., 2001).
The data in cache contains a version tag whereas the cache itself can maintain many versions of the same data. The software sends a signal to the processor to commence a transaction and performs the necessary action. This then signals the processor to commit the work. If other threads have changed the data, resulting in many versions, the cache will refuse the transaction and the software will be forced to try again. Should other versions not be created, then the data is committed (Bright, 2011).
This facility is also applicable for speculative execution. A thread can commence execution with data available, whereas speculatively conducting important work – instead of waiting for upgraded versions of all data needed – might mean waiting for additional cores to complete computation (Alexandrescu, 2004). If the data was upgraded, then the work that is committed provides a performance boost; the work had been completed before the delivery of the final value. Should the data turn out to be stale, then the speculative work is rejected and re-deployed with the correct value (Bright, 2011).
A significant advantage that transactional memory has over traditional lock-based programs is that it support is an extension of a load-link or store conditional. Load-link is an undeveloped operation that can be implemented to build many types of thread-safe constructs (Maged, 2004). This comprises both mechanisms that are known, such as locks, and unconventional data structures, such as lists that can be changed by many threads at the same time without locking at all (Alexandrescu, 2004). The creation of software transactional memory is possible through the use of load-link or store conditional.
Load-link or store conditional contains two sections: firstly, it utilizes load link to recover the value from memory where it can then conduct the functions it needs to perform on that value. When there is a need to write a new value to the memory, this utilizes store conditional (Detlefs et al., 2001). Store conditional can only succeed if the memory value has not been changed after the load link. In case the value has been changed, the program has to return to the beginning and start again. These systems are restrictive because they do not follow writes to each memory bytes, but to the whole cache lines. This highlights the fact that store conditional has the potential to fail without modification of monitored value (Bright, 2011). Bright (2011) explains that store conditional is also most likely to fail if a context switch happens between the load link and store conditional. Transactional memory is a version of an enforced link load and store conditional; each thread can perform load link on several different memory locations (Maged, 2004). In addition to this, the commit operation does store conditional. This impacts on multiple locations at the same time, with each store either succeeding or failing (Bright, 2011).
In conclusion, a lock-free procedure is sure to sustain the progress of a thread executing a procedure. While some threads can be put on hold arbitrarily, one thread is certain to progress each move. The whole system can then make progress despite the fact that some threads might take longer than others. It can be seen, therefore, that the use of software or hardware transactional memory presents better ways of ensuring consistency of stored data when accessed and manipulated by several concurrent threads than traditional ‘semaphore’ locking. Consequently, lock-based programs fail to provide any of the above mentioned guarantees
Alexandrescu, A. (2004) Lock-Free Data Structures. Available at: http://www.drdobbs.com/lock-free-data-structures/184401865 [Accessed 12th March 2014].
Bright, P. (2011) IBM’s new transactional memory: make-or-break time for multithreaded revolution. Available at: http://arstechnica.com/gadgets/2011/08/ibms-new-transactional-memory-make-or-break-time-for-multithreaded-revolution/ [Accessed 12th March 2014].
Detlefs, D., Martin, P.A., Moir, M. & Steele, G.L., (2001) ‘The Twentieth Annual ACM Symposium on Principles of Distributed Computing’, in Lock-free Reference Counting, ACM Press: New York.
Maged, M.M. (2004) ‘Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation’, in Scalable Lock-free Dynamic Memory Allocation, ACM Press: New York.