Monday, March 17, 2014

General Facts

Subroutine:
In computer programming, a subroutine is a sequence of program instructions that performs a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Subroutines may be defined within programs, or separately in libraries that can be used by multiple programs.

In different programming languages a subroutine may be called a procedure, a function, a routine, a method, or a subprogram. The generic term callable unit is sometimes used.


Reentrancy:
In computing, a computer program or subroutine is called reentrant if it can be interrupted in the middle of its execution and then safely called again ("re-entered") before its previous invocations complete execution. The interruption could be caused by an internal action such as a jump or call, or by an external action such as a hardware interrupt or signal. Once the reentered invocation completes, the previous invocations will resume correct execution.

This definition originates from single-threaded programming environments where the flow of control could be interrupted by a hardware interrupt and transferred to an interrupt service routine (ISR). Any subroutine used by the ISR that could potentially have been executing when the interrupt was triggered should be reentrant. Often, subroutines accessible via the operating system kernel are not reentrant. Hence, interrupt service routines are limited in the actions they can perform; for instance, they are usually restricted from accessing the file system and sometimes even from allocating memory.

A subroutine that is directly or indirectly recursive should be reentrant. This policy is partially enforced by structured programming languages. However, a subroutine can fail to be reentrant if it relies on a global variable to remain unchanged but that variable is modified when the subroutine is recursively invoked.
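
To make that failure mode concrete, here is a minimal sketch in C. The names scratch, sum_to_bad and sum_to_ok are made up for this illustration; the only point is that the first version parks a value in a global across the recursive call, and the second keeps it in a local.

```c
/* Hypothetical example: a global scratch variable breaks a recursive function. */
static int scratch;                 /* one copy, shared by every invocation */

/* NOT reentrant: expects `scratch` to survive the recursive call. */
int sum_to_bad(int n)
{
    if (n <= 0)
        return 0;
    scratch = n;                    /* park n in the global ...              */
    int rest = sum_to_bad(n - 1);   /* ... but the nested call overwrites it */
    return scratch + rest;          /* scratch no longer holds this call's n */
}

/* Reentrant: each invocation keeps its state in its own stack frame. */
int sum_to_ok(int n)
{
    if (n <= 0)
        return 0;
    int saved = n;                  /* automatic variable, private to this call */
    int rest = sum_to_ok(n - 1);
    return saved + rest;
}
```

With these definitions sum_to_ok(3) returns 6, while sum_to_bad(3) returns 3, because every level of the recursion ends up reading the value left behind by the innermost call.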

This definition of reentrancy differs from that of thread-safety in multi-threaded environments. A reentrant subroutine can achieve thread-safety, but being reentrant alone might not be sufficient to be thread-safe in all situations. Conversely, thread-safe code does not necessarily have to be reentrant (see below for examples).

Other terms used for reentrant programs include "pure procedure" or "sharable code".
Example

This is an example of a swap() function that fails to be reentrant (as well as failing to be thread-safe). As such, it should not have been used in the interrupt service routine isr():

int t;

void swap(int *x, int *y)
{
    t = *x;
    *x = *y;

    // hardware interrupt might invoke isr() here!
    *y = t;
}

void isr()
{
    int x = 1, y = 2;
    swap(&x, &y);
}
swap() could be made thread-safe by making t thread-local. It still fails to be reentrant, and this will continue to cause problems if isr() is called in the same context as a thread already executing swap().

The following (somewhat contrived) modification of the swap function, which is careful to leave the global data in a consistent state at the time it exits, is perfectly reentrant; however, it is not thread-safe since it does not ensure the global data is in a consistent state during execution:

int t;

void swap(int *x, int *y)
{
    int s;

    s = t; // save global variable
    t = *x;
    *x = *y;

    // hardware interrupt might invoke isr() here!
    *y = t;
    t = s; // restore global variable
}

void isr()
{
    int x = 1, y = 2;
    swap(&x, &y);
}
Source: en.wikipedia.org
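
For comparison, and not part of the quoted article, the simplest way to get a swap that is both reentrant and thread-safe is to drop the global altogether. This is only a sketch; the name swap_local is made up here, and it assumes the two integers being swapped are not themselves shared with another thread or interrupt handler.

```c
/* Sketch: a swap with no shared state at all. */
void swap_local(int *x, int *y)
{
    int tmp = *x;   /* automatic variable: every invocation gets its own copy */
    *x = *y;
    *y = tmp;
}
```

Because tmp lives in the stack frame of each invocation, an interrupt or a second thread calling swap_local() on its own data cannot disturb a call that is already in progress.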

Thread safety:
Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it only manipulates shared data structures in a manner that guarantees safe execution by multiple threads at the same time. There are various strategies for making thread-safe data structures.
There are several approaches for avoiding race conditions to achieve thread safety. The first class of approaches focuses on avoiding shared state, and includes:

Re-entrancy 
Writing code in such a way that it can be partially executed by a thread, reexecuted by the same thread or simultaneously executed by another thread and still correctly complete the original execution. This requires the saving of state information in variables local to each execution, usually on a stack, instead of in static or global variables or other non-local state. All non-local state must be accessed through atomic operations and the data-structures must also be reentrant.

Thread-local storage 
Variables are localized so that each thread has its own private copy. These variables retain their values across subroutine and other code boundaries, and are thread-safe since they are local to each thread, even though the code which accesses them might be executed simultaneously by another thread.
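
As a rough illustration of thread-local storage, the following sketch uses the C11 _Thread_local storage class together with POSIX threads. The names calls, count_call, worker and run_two_workers are invented for this example, and it assumes a compiler and C library that support both C11 thread-local variables and pthreads.

```c
#include <pthread.h>

/* C11 thread-local storage: every thread gets its own `calls` counter. */
static _Thread_local int calls = 0;

/* Hypothetical helper: counts the calls made by the current thread only. */
int count_call(void)
{
    return ++calls;
}

/* Thread entry point: calls count_call() three times and reports the last value. */
static void *worker(void *out)
{
    int last = 0;
    for (int i = 0; i < 3; i++)
        last = count_call();
    *(int *)out = last;
    return NULL;
}

/* Runs two workers; returns 1 if each one saw its own private count of 3. */
int run_two_workers(void)
{
    pthread_t a, b;
    int ra = 0, rb = 0;

    if (pthread_create(&a, NULL, worker, &ra) != 0)
        return 0;
    if (pthread_create(&b, NULL, worker, &rb) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return ra == 3 && rb == 3;
}
```

If calls were an ordinary static variable, the two workers would race on it and could see any value up to 6. Note that this makes count_call() thread-safe but not reentrant: an interrupt or signal handler running on the same thread still shares that thread's copy.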

The second class of approaches is synchronization-related, and is used in situations where shared state cannot be avoided:

Mutual exclusion
Access to shared data is serialized using mechanisms that ensure only one thread reads or writes to the shared data at any time. Incorporation of mutual exclusion needs to be well thought out, since improper usage can lead to side-effects like deadlocks, livelocks and resource starvation.

Atomic operations 
Shared data are accessed by using atomic operations which cannot be interrupted by other threads. This usually requires using special machine language instructions, which might be available in a run-time library. Since the operations are atomic, the shared data are always kept in a valid state, no matter how other threads access them. Atomic operations form the basis of many thread locking mechanisms, and are used to implement mutual exclusion primitives.

Immutable objects 
The state of an object cannot be changed after construction. This implies both that only read-only data is shared and that inherent thread safety is attained. Mutable (non-const) operations can then be implemented in such a way that they create new objects instead of modifying existing ones. This approach is used by the string implementations in Java, C# and Python.
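
C has no built-in notion of immutable objects, but the idea can be approximated with const. The sketch below is only an illustration; the type point and the function point_translate are hypothetical names, not part of any library.

```c
/* Hypothetical immutable value type: the fields are const, so they cannot be
   assigned to once the object has been initialized. */
struct point {
    const int x;
    const int y;
};

/* "Mutating" operation: builds and returns a new point instead of changing *p.
   It only reads through p, so any number of threads may share the same point. */
struct point point_translate(const struct point *p, int dx, int dy)
{
    struct point moved = { p->x + dx, p->y + dy };
    return moved;
}
```

A caller would write something like struct point q = point_translate(&p, 3, 4); and p itself is left untouched, which is why no locking is needed around it.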

Examples

In the following piece of C code, the function is thread-safe, but not reentrant:

#include <pthread.h>

int increment_counter ()
{
    static int counter = 0;
    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

    pthread_mutex_lock(&mutex);

    // only allow one thread to increment at a time
    ++counter;
    // store value before any other threads increment it further
    int result = counter;

    pthread_mutex_unlock(&mutex);

    return result;
}
In the above, increment_counter can be called by different threads without any problem since a mutex is used to synchronize all access to the shared counter variable. But if the function is used in a reentrant interrupt handler and a second interrupt arises while the mutex is locked, the second invocation will hang forever, because the interrupted invocation that holds the mutex cannot resume to release it. As interrupt servicing can disable other interrupts, the whole system could suffer.

The same function can be implemented to be both thread-safe and reentrant using the lock-free atomics in C++11:

#include <atomic>

int increment_counter ()
{
    static std::atomic<int> counter(0);

    // increment is guaranteed to be done atomically
    int result = ++counter;

    return result;
}
Source: en.wikipedia.org
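
Since the rest of the code here is C, the same idea can be sketched with the C11 header <stdatomic.h>. This is not from the quoted article; the name increment_counter_c11 is made up, and the sketch assumes a C11 compiler. Whether atomic_int is actually lock-free depends on the platform (the ATOMIC_INT_LOCK_FREE macro reports this), and the function is only safe to reenter from an interrupt or signal handler when it is.

```c
#include <stdatomic.h>

/* Sketch: C11 counterpart of the C++11 example, returning the new count. */
int increment_counter_c11(void)
{
    static atomic_int counter = 0;

    /* atomic_fetch_add returns the value held BEFORE the addition,
       so add 1 to report the value after this call's increment. */
    return atomic_fetch_add(&counter, 1) + 1;
}
```

The first call returns 1, the second 2, and so on, without any lock that an interrupted invocation could be left holding.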



strtok()
#include <stdio.h>
#include <string.h>

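int main ()
{
    // a writable array, not a string literal: strtok modifies it in place
    char str[] ="- This, a sample string.";
    char * pch;
    printf ("Splitting string \"%s\" into tokens:\n",str);
    // first call: pass the string to tokenize
    pch = strtok (str," ,.-");
    while (pch != NULL)
    {
        printf ("%s\n",pch);
        // later calls: pass NULL to continue with the same string
        pch = strtok (NULL, " ,.-");
    }
    return 0;
}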
Two things to know about strtok. As was mentioned, it "maintains internal state". Also, it modifies the string you feed it. Essentially, it writes a '\0' over the delimiter that ends each token and returns a pointer to the start of that token. Internally it remembers where the last token ended, and the next time you call it (with NULL as the first argument), it continues from there.
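
That hidden internal state is exactly what makes strtok non-reentrant: a second tokenizing loop started inside the first one overwrites the saved position. POSIX provides strtok_r, which takes the saved position as an explicit third argument instead. The following is a sketch only; count_fields is a hypothetical helper, and it assumes a POSIX system where strtok_r is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Hypothetical helper: counts the fields in a buffer such as "a=1;b=2".
   Records are split on ';' and each record is split again on '='.
   The buffer is modified in place, just as with strtok. */
int count_fields(char *buf)
{
    int fields = 0;
    char *outer_state;   /* saved position of the outer (record) scan */
    char *inner_state;   /* saved position of the inner (field) scan  */

    for (char *rec = strtok_r(buf, ";", &outer_state);
         rec != NULL;
         rec = strtok_r(NULL, ";", &outer_state)) {
        /* The nested scan keeps its own state, so it cannot disturb the outer one. */
        for (char *fld = strtok_r(rec, "=", &inner_state);
             fld != NULL;
             fld = strtok_r(NULL, "=", &inner_state)) {
            fields++;
        }
    }
    return fields;
}
```

For the buffer "a=1;b=2;c=3" this counts 6 fields. Writing the same nested loops with plain strtok would end the outer loop after the first record, since the inner loop replaces the single saved position.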

The important corollary is that you cannot use strtok on a string literal such as const char *s = "hello world";. Modifying a string literal is undefined behavior in C, and in practice it usually ends in an access violation, because literals are typically stored in read-only memory.
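
The usual workaround is to tokenize a writable copy. Here is a minimal sketch; count_tokens is a made-up helper name, and returning -1 on allocation failure is just a convention chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in a read-only string by
   tokenizing a private, writable copy. Returns -1 if allocation fails. */
int count_tokens(const char *text, const char *delims)
{
    size_t len = strlen(text) + 1;      /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, text, len);

    int count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    free(copy);                         /* the caller's string was never touched */
    return count;
}
```

With this, count_tokens("hello world", " ") is safe even though the argument is a literal, at the price of the extra allocation that strtok otherwise avoids.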

The "good" thing about strtok is that it doesn't actually copy strings - so you don't need to manage additional memory allocation etc. But unless you understand the above, you will have trouble using it correctly.

Example - if you have "this,is,a,string", successive calls to strtok will generate pointers as follows (the ^ is the value returned). Note that a '\0' is written where each delimiter is found; this means the source string is modified:

t  h  i  s  ,  i  s  ,  a  ,  s  t  r  i  n  g \0         this,is,a,string

t  h  i  s  \0 i  s  ,  a  ,  s  t  r  i  n  g \0         this
^
t  h  i  s  \0 i  s  \0 a  ,  s  t  r  i  n  g \0         is
               ^
t  h  i  s  \0 i  s  \0 a  \0 s  t  r  i  n  g \0         a
                        ^
t  h  i  s  \0 i  s  \0 a  \0 s  t  r  i  n  g \0         string
                              ^
Source: stackoverflow.com
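
The diagram can be checked with a few lines of C. This is a sketch, not part of the quoted answer; token_offsets is a hypothetical helper that records where each returned pointer lands inside the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes buf on ',' and stores the offset of each
   token (relative to buf) in offsets[]. Returns the number of offsets
   stored, which is never more than max. */
size_t token_offsets(char *buf, size_t offsets[], size_t max)
{
    size_t n = 0;
    for (char *tok = strtok(buf, ",");
         tok != NULL && n < max;
         tok = strtok(NULL, ",")) {
        offsets[n++] = (size_t)(tok - buf);   /* pointer into the same buffer */
    }
    return n;
}
```

For a buffer holding "this,is,a,string" the recorded offsets are 0, 5, 8 and 10, matching the ^ positions above, and afterwards the bytes at offsets 4, 7 and 9 (the former commas) are '\0', so the buffer now reads as just "this" when printed as a string.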

