Limiting Component Instances

In such a case the component must ensure that only a single instance executes at a time, independent of whether or not an instance can be reused: a reusePolicy of “any” or “none” can still produce multiple concurrently executing instances of the component execution class.

A component may define an integer property named “maxConcurrent.” The value of this property determines how many concurrent instances of the component can execute at the same time. If not specified, the default value of this property is unlimited. However, if the component metamodel XML defines this property with the value of “1,” the system will guarantee that only one instance of the component executes at a time. As a result, code that is not thread-safe with respect to static data will execute without data corruption from multiple executing component instances.

When maxConcurrent is set to “1,” only a single execution of the component can proceed at a time. If there are multiple requests for the component to execute at the same time, they are queued and only a single execution will be allowed to run. For highly concurrent and shared systems such as the Fiper distributed execution environment, the setting can cause a severe bottleneck and affect overall system throughput and performance, particularly for long-running components. The setting is not recommended unless it is required for proper execution behavior of the component.
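The queuing effect of a concurrency limit can be illustrated in plain Java. The following is a minimal sketch only: the class MaxConcurrentSketch and its methods are hypothetical and are not part of the product API, and a fair java.util.concurrent.Semaphore is assumed here as a stand-in for whatever mechanism the system actually uses.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a scheduler that enforces a concurrency limit.
final class MaxConcurrentSketch {
    private final Semaphore permits;
    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    MaxConcurrentSketch(int maxConcurrent) {
        // Fair semaphore: waiting requests are granted permits in arrival order.
        this.permits = new Semaphore(maxConcurrent, true);
    }

    // Runs one "execution"; requests beyond the limit block here until a permit is free.
    void execute(Runnable work) {
        permits.acquireUninterruptibly();
        try {
            // Track the highest number of executions in progress at once.
            peak.accumulateAndGet(running.incrementAndGet(), Math::max);
            work.run();
        } finally {
            running.decrementAndGet();
            permits.release();
        }
    }

    // Submits `requests` simultaneous executions and returns the peak concurrency observed.
    static int measurePeak(int maxConcurrent, int requests) {
        MaxConcurrentSketch limiter = new MaxConcurrentSketch(maxConcurrent);
        ExecutorService pool = Executors.newFixedThreadPool(requests);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> limiter.execute(() -> pause(20)));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return limiter.peak.get();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a limit of 1, four simultaneous requests run one after another, so the total elapsed time is roughly the sum of the individual execution times. This is the throughput cost described above.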

In a Fiper environment, maximum concurrency of one is per JVM (each station can have one copy). If run-as security is activated, then maximum concurrency is one copy per user per station.

Process components must never have a maxConcurrent value of “1” because it can cause any model that uses the component in a nested fashion to deadlock. For example, in the following model the same component type is used at two levels of the model. If the process component had a maxConcurrent of “1,” the model would stop executing when it reached the second level of the model:

The maxConcurrent setting can be combined with the reusePolicy setting. When maxConcurrent is set to “1” and reusePolicy to “any,” the system will create only a single instance of the class for all executions of the component. Only a single execution will proceed at a time; when it completes, the same class instance will be used to run the next execution. The component and any native code it calls must be serially reusable to support this mode of execution. For more information, see Reusable and Persistent Components. In general, native code will not be serially reusable if it depends on the initial value of static data that are subsequently updated when the code runs. The next execution of the code will find the static data with prior values instead of the initial values. For more information, see Use of Native Code.

When a maxConcurrent setting of “1” is combined with a reusePolicy of “none,” a new component instance will be created for each execution of the component. When the execution is complete, the destroy() method will be called and the component instance will be dereferenced (and may be subsequently garbage collected). The next execution request will then create a new component instance and repeat the execution cycle.
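The difference between the two combinations can be sketched as follows. This is an illustration under stated assumptions, not the system implementation: ReusePolicySketch and its nested Instance class are hypothetical, and a synchronized method is used to model a maxConcurrent value of “1.”

```java
// Hypothetical model of maxConcurrent = 1 combined with each reusePolicy.
final class ReusePolicySketch {

    // Minimal stand-in for a component execution class instance.
    static final class Instance {
        final int id;
        boolean destroyed;

        Instance(int id) { this.id = id; }

        void execute() { /* component work would go here */ }

        void destroy() { destroyed = true; }
    }

    private final boolean reuseAny; // true models "any", false models "none"
    private Instance cached;        // the single retained instance when reuseAny is true
    private int created;            // number of instances created so far

    ReusePolicySketch(boolean reuseAny) {
        this.reuseAny = reuseAny;
    }

    // synchronized: only one execution proceeds at a time (maxConcurrent = 1).
    // Returns the id of the instance that ran this execution.
    synchronized int runOnce() {
        Instance instance = (reuseAny && cached != null) ? cached : new Instance(++created);
        instance.execute();
        if (reuseAny) {
            cached = instance;   // keep the same instance for the next execution
        } else {
            instance.destroy();  // destroy and dereference; the next run creates a new one
        }
        return instance.id;
    }

    synchronized int instancesCreated() {
        return created;
    }
}
```

In this sketch, repeated calls with the “any” policy always return the same instance id, while the “none” policy returns a new id for every call.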

Use of Native Code

When a component uses native code, the Java Virtual Machine will load and statically initialize the native code just one time. Even if the component class instance is dereferenced and even if garbage collected, the JVM does not unload the native code. The next use of the component, even if a new Java class instance is created, will continue to use the already loaded native code and static data. The reusePolicy affects only the reuse of Java classes and does not affect native code. As a result, any native code used in any component or plug-in must be serially reusable. In addition, the native code must not depend on the initial value of static data that is subsequently modified during execution. Such code must be modified to initialize static data explicitly for every execution of the component.
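The effect of static data that outlives a component instance can be shown with a Java analogy. In the sketch below, a Java static field stands in for static storage inside a native library (an assumption made for illustration only; the class and method names are hypothetical). The first method depends on the initial value of the static data and is therefore not serially reusable; the second initializes the data explicitly on every execution.

```java
// Java analogy for native static storage that is initialized once per process.
final class StaticStateSketch {

    // Stands in for static data in a native library: never reset by the JVM.
    private static int accumulator = 0;

    // Not serially reusable: the result depends on what earlier executions left behind.
    static int runWithoutReset(int[] values) {
        for (int v : values) {
            accumulator += v;
        }
        return accumulator;
    }

    // Serially reusable: the static data is initialized explicitly for every execution.
    static int runWithReset(int[] values) {
        accumulator = 0;
        for (int v : values) {
            accumulator += v;
        }
        return accumulator;
    }
}
```

Calling runWithReset() twice with the same input gives the same answer both times. Calling runWithoutReset() twice gives a second answer that includes the first, which is the failure mode described above. The sketch addresses serial reuse only; it is not thread-safe.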

It is difficult to write Fortran code that can be safely used in an Isight component because most Fortran compilers create COMMON blocks as initialized data that are never reset once the program starts running. Special compiler options must be set when compiling Fortran code for use in a shared library called from Java to cause the compiler to store local variables on the stack instead of in a BLANK COMMON area.

There is a restriction on native code, not the Java code, because there can be only one instance of the native code that will be shared and reused by all instances of the Java component execution class.

The above limitation on native code applies only to native code that is directly invoked from Java through the Java Native Interface (JNI). It does not apply to native code that is launched as a separate process.

The following table summarizes the recommended settings based on the nature of the Java component executor class and the native code it invokes. If not specified in the component XML descriptor, the default settings are unlimited maxConcurrent and a reusePolicy of “none.”

Recommended Settings for Java Component Executor Class
Java Code	Native Code	MaxConcurrent	Reuse Policy
Thread Safe, Serially Reusable	Thread Safe, Serially Reusable	Unlimited	ANY
Thread Safe, Not Serially Reusable	Thread Safe, Serially Reusable	Unlimited	NONE
Not Thread Safe, Serially Reusable	Serially Reusable	1	ANY
Not Thread Safe, Not Serially Reusable	Serially Reusable	1	NONE

Thread-Safe Java Component. Has no Java STATIC data or uses thread synchronization when accessing STATIC data. Does not have to protect against multiple threads of execution in the same instance because the system will always use a single thread for a component instance.

Serially Reusable Java Component. Has no class fields or resets the class field values for each execution.

Thread-Safe Native Code. Supports multiple concurrent threads of execution with appropriate protection of any data that is not allocated dynamically.

Serially Reusable Native Code. Resets, or does not depend on, the value of all storage before use. Static storage can be used if not modified during execution or if the static values are reset before use.
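The two Java definitions above can be combined in one class. The following sketch assumes a hypothetical executor class (ReusableExecutorSketch is not a product class): shared static data is held in an atomic type so that concurrent instances do not corrupt it, and instance fields are reset at the start of each execution so that a reused instance starts clean.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical executor class that is both thread-safe and serially reusable.
final class ReusableExecutorSketch {

    // Static data shared by all instances and threads: an atomic type keeps updates safe.
    private static final AtomicLong TOTAL_EXECUTIONS = new AtomicLong();

    // Per-instance working state; must not carry over from one execution to the next.
    private final StringBuilder scratch = new StringBuilder();
    private int itemsSeen;

    String execute(String... items) {
        // Reset the class fields first so that a reused instance behaves like a new one.
        scratch.setLength(0);
        itemsSeen = 0;

        for (String item : items) {
            scratch.append(item);
            itemsSeen++;
        }
        TOTAL_EXECUTIONS.incrementAndGet();
        return itemsSeen + ":" + scratch;
    }

    static long totalExecutions() {
        return TOTAL_EXECUTIONS.get();
    }
}
```

Because each instance is only ever run by one thread at a time, the instance fields need no synchronization; only the static counter does.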

Job-Level Persistence

Occasionally, a component execution class may deliberately retain or accumulate data from one execution to the next. In general, retention of data is not recommended because it greatly restricts the flexibility of the infrastructure to assign component instances and computers for execution and can cause memory growth and have other unintended side effects. In addition, the system makes few guarantees about where and how a component is executed, so the component developer cannot make assumptions about how or if a given component instance will be reused.

Such special case components usually accumulate data associated with their use in a particular model. For example, some calculated data may be held from one execution to the next, with the assumption that each execution is part of the same job. In general, that assumption is not valid; the system may reuse a component instance for different jobs at any time, as well as interleave executions from many jobs. However, the system does provide a means to declare that a component instance can be “scoped” to a single job. That is, the instance will be restricted to running only within the job in which it is first used. All subsequent executions of the component will be part of that same job. When the job ends, the component may be reused for another job if the reusePolicy is “any” or the component will be destroyed if the reusePolicy is “none.”

Designating a component scoped to a job is done by defining a component property named “reuseScope” with the value of “job.”

When a component has the reuseScope value of “job,” the maxConcurrent value still controls how many concurrent instances of the execution class will be used at the same time. If maxConcurrent is “1,” the single instance will be restricted to the first job in which it is used. Any request to execute the component from some other job will be queued until the first job completes, which can have a major impact on overall system throughput because it forces one job to wait for another job to complete. The component instance cannot be released until the entire job in which it was used completes, which may be a long time after the component execution is done. Therefore, it is highly recommended that components with a reuseScope of “job” have a maxConcurrent greater than “1,” which also implies they must be thread-safe.

In general, specifying a reuseScope of “job” on a component does not mean that every execution of that component in the job will use the same class instance. In the distributed SIMULIA Execution Engine execution environment, the component may be dispatched to different SIMULIA Execution Engine stations for execution. In this case each station will have a (dedicated) instance for the job. Unless some other means such as affinities are used to control the dispatching locations, the component cannot assume that a single instance will process every execution within a job.

Job level scoping reserves an instance of the component execution class for the job. If the same component (type) is used in multiple places in the model, the same instance will be used for all of them. For example, the same component can be used at several levels of the model or in multiple places in the simulation process flow. With job-level persistence, the component execution instance will be used for all of them because they run as part of the same job. If each of the unique usages of the component within the model needs to have a separate instance, use the reuseScope setting of “job-comppath” as described in Job-Component Path Persistence.

Job-Component Path Persistence

The job level scoping described in the prior section will use a single instance to execute all uses of a particular component type within a model. For example, the following model has three usages of the Calculator type component named C1, C2, and C3.

If the Calculator component had a reuseScope of “job,” a single instance would be used whenever C1, C2, or C3 were to be executed (assuming they were all dispatched to the same SIMULIA Execution Engine station).

If you want each unique use of the component in the model to have its own instance, a reuseScope of “job-comppath” can be specified. The scope of the component instance is a unique component path within the model. Each use of the Calculator component in the example above has a unique path from the root (e.g., C1 has a path of “Task1.C1,” component C2 has a path of “Task1.DOE.C2,” etc.).

Similar to job level persistence, each instance will be reserved for the job and the unique component path within that job.

A maxConcurrent value of “1” must never be used with a reuseScope of “job-comppath” because a model such as that shown above would logically deadlock. The execution of C1 would reserve a component instance; and when it came time to execute C2, the system would be unable to create another component instance because of the maxConcurrent limit. The execution of C2 would wait indefinitely for the component instance to become available. It cannot become available until the job ends, so the execution of C2 would be blocked indefinitely.
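The deadlock can be reasoned about with a small model of instance reservation. The sketch below is a simplification under stated assumptions: CompPathDeadlockSketch is hypothetical, it tracks one reserved instance per component path for a single job, and it reports an unavailable instance by returning false instead of blocking.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "job-comppath" reservations within a single job.
final class CompPathDeadlockSketch {
    private final int maxConcurrent;

    // Instances reserved for the running job, keyed by component path.
    private final Map<String, Object> reservedByPath = new HashMap<>();

    CompPathDeadlockSketch(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    // Returns true if an instance is available for this path.
    // A false result means the request would wait until the job ends.
    boolean tryReserve(String componentPath) {
        if (reservedByPath.containsKey(componentPath)) {
            return true; // this path already owns an instance and reuses it
        }
        if (reservedByPath.size() >= maxConcurrent) {
            return false; // limit reached; no instance is released before the job ends
        }
        reservedByPath.put(componentPath, new Object());
        return true;
    }

    // Reservations are released only when the whole job completes.
    void endJob() {
        reservedByPath.clear();
    }
}
```

With a limit of 1, reserving “Task1.C1” succeeds and the later request for “Task1.DOE.C2” cannot be satisfied. Because the job cannot end until C2 runs, and the instance cannot be released until the job ends, the wait never completes.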

Component Cancel and Timeout

It is possible that the system might need to stop the execution of a component before the lifecycle methods have completed. For example, the user might have chosen to cancel the job in which the component is executing, or the component might have exceeded its maximum execution time (as configured in the model). The component might also be stopped for other reasons.

The system uses several mechanisms when attempting to stop an executing component. In general, the component is inside the execute() method when the system attempts to stop it. The Java language does not provide a reliable means to interrupt arbitrarily or to stop cleanly a thread of execution; there is no way to force the component execute() method to return.

When the system has determined that a component execution needs to be stopped, it takes the following steps:

If the component has established a runtime message listener by calling the StationEnv.addRuntimeMessageListener() method, the listener is called with a RuntimeMessage object. The type of the object indicates the reason for the stop request. Currently, it will be either a RuntimeMessageJobCancelled object or a RuntimeMessageTimeout object.

This method serves as notification to the component that the system has requested that the component execution be stopped. Actions taken in response to this message depend on the nature and design of the component. Most components will not need to respond to this message; they can depend on the default interrupt behavior described below.

If a component does implement a runtime message listener, the listener method is called on an independent thread. The component is responsible for any thread synchronization that might be required.
The JVM thread executing the component—e.g., the thread that is executing inside the component execute() method—is flagged for interruption by the Thread.interrupt() method.

In the Java language this does not cause an immediate interruption or halt execution of the thread. If the thread is blocked waiting for I/O or in an Object.wait() call, immediate action is taken, and the operation will fail with an InterruptedException. If the thread is not blocked on I/O or waiting, it will continue execution indefinitely. Thus, if the execute() method is stuck in an infinite loop or ignores the InterruptedException, it will not generally respond to the interrupt request.

This mechanism represents a “best effort” on the part of the system to end the execute() method gracefully—e.g., either the method returns or throws an exception; either is okay. Depending on the design of the component execute() method and what it is doing at the time of the interrupt, it might or might not be effective at causing the execute() method to return or throw an exception.

If the component has the Boolean property “InterruptToStop” with a value of FALSE, this step will be skipped and the thread will not be interrupted. Some components cannot safely handle interrupts. However, the ability of the system to cleanly stop the component execution and ensure proper cleanup of resources and execution of the lifecycle methods will be limited.
The system will then wait 10 seconds for the component to either return from the execute() method or throw an exception from the execute() method.
If the component execute() method has returned or thrown an exception after 10 seconds, typical component lifecycle processing continues as usual and no further action is taken to stop it.
Otherwise, the system takes the following steps:
1. The component execution is considered to have failed. The workitem associated with this component execution is marked as “failed” and the job’s simulation process flow continues as it usually would for a failed workitem.
2. An ERROR record is written to the job log noting the failure and the reason the component was stopped.
3. A WARNING record is written to the station log (SIMULIA Execution Engine environment) or Gateway log (Isight) noting that the component did not respond to the stop request.
4. The component class instance is marked as “discarded.” This component class instance will never be used again, even if the execute() method does eventually return or throw an exception. The system removes all references to this class instance, and it may later be garbage collected by the JVM. Only if the execute() method returns or throws will the remaining lifecycle methods be executed, including destroy(). Thus, a hung execute() method can result in destroy() never being called.
5. The next time the system needs to execute this component, a new instance will be created and the usual full lifecycle methods will be called on the new instance. Even if the component has a maxConcurrent value of “1,” the system will still create another instance even though there is a prior instance.
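The interrupt-and-wait portion of the sequence above can be sketched with standard Java threading calls. This is an illustration only, not the system implementation: StopSequenceSketch, its Outcome enum, and the two sample Runnable bodies are hypothetical, and the grace period is a parameter so that the sketch can be tried with values much shorter than 10 seconds.

```java
// Hypothetical model of: interrupt the thread, wait a grace period, then discard.
final class StopSequenceSketch {

    enum Outcome { STOPPED_CLEANLY, DISCARDED }

    // Interrupts the worker, then waits up to graceMillis (must be > 0) for it to finish.
    static Outcome requestStop(Thread worker, long graceMillis) {
        worker.interrupt();
        try {
            worker.join(graceMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // A worker that is still alive did not respond to the stop request.
        return worker.isAlive() ? Outcome.DISCARDED : Outcome.STOPPED_CLEANLY;
    }

    // Starts a daemon thread so that an unresponsive body cannot keep the JVM alive.
    static Thread start(Runnable body) {
        Thread t = new Thread(body);
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Responds to interrupts: sleep() throws InterruptedException and the body returns.
    static final Runnable RESPONSIVE = () -> {
        try {
            Thread.sleep(60_000);
        } catch (InterruptedException e) {
            // Fall through and return promptly.
        }
    };

    // Ignores interrupts: swallows InterruptedException and keeps going for two seconds.
    static final Runnable UNRESPONSIVE = () -> {
        long end = System.nanoTime() + 2_000_000_000L;
        while (System.nanoTime() < end) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                // Deliberately ignored to model a component that does not stop.
            }
        }
    };
}
```

A body that lets InterruptedException end its work is reported as STOPPED_CLEANLY; a body that swallows the exception outlives the grace period and is reported as DISCARDED, which corresponds to the failure handling in steps 1 through 5.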

It is important to try to design component execution code to be responsive to system stop requests. Either the execute() method should return or throw when interrupted, or the component should cause it to return or throw when it receives a stop message on the runtime message listener interface.

If a process component is blocked waiting for some system event—e.g., waiting for a subflow to complete through the RuntimeEnv.getExecutor().waitForSubflow() method—the system will detect the interrupt and immediately throw an appropriate exception from that method. Generally, the component should allow that exception to be thrown out of the execute() method, thus effectively ending execution. See the example execute() method shown in Method: execute().

If an execute() method contains a loop that may run for a long time or has risk of becoming infinite, the component should explicitly check the thread interrupt status each time through the loop (or more often as appropriate). The goal is for the component to detect and respond to the interrupt within 10 seconds. For example:

public void execute(RuntimeEnv runEnv) throws RtException {
   try {
       // This might loop for a long time...
      for (...) {

         // Do some work, less than 10 seconds long

         if (Thread.interrupted()) {
            throw new RtInterruptedException("My Component was interrupted.");
         }
      }
   } 
   catch (RtException rte) {
      throw rte;  // No need to wrap RtExceptions
   }
   catch (Throwable t) {
      throw new RtException(t, "My component failed to execute.");
   }
}

Components that use Thread.sleep() do not, in general, need to do anything special to handle interrupts. The sleep() method will throw an InterruptedException if the thread is interrupted. The component should be written to propagate that exception out of the execute() method as an RtException (the last catch clause above would do this, for example).
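The Thread.sleep() case can be tried outside the product with the following sketch. ComponentException is a hypothetical stand-in for RtException (whose constructors are not reproduced here), and stopsWhenInterrupted() simulates a pending stop request by setting the interrupt flag on the current thread before calling execute().

```java
// Hypothetical, self-contained model of propagating an interrupt out of execute().
final class SleepInterruptSketch {

    // Stand-in for the RtException type named in the text.
    static final class ComponentException extends Exception {
        ComponentException(Throwable cause, String message) {
            super(message, cause);
        }
    }

    static void execute(long sleepMillis) throws ComponentException {
        try {
            Thread.sleep(sleepMillis); // throws at once if the thread is already interrupted
        } catch (InterruptedException e) {
            // Restore the flag so that callers can still observe the interrupt.
            Thread.currentThread().interrupt();
            throw new ComponentException(e, "My component was interrupted.");
        }
    }

    // Returns true if execute() ended with an exception caused by the interrupt.
    static boolean stopsWhenInterrupted() {
        Thread.currentThread().interrupt(); // simulate a pending stop request
        try {
            execute(60_000);
            return false;
        } catch (ComponentException e) {
            return e.getCause() instanceof InterruptedException;
        } finally {
            Thread.interrupted(); // clear the flag for the caller of this demonstration
        }
    }
}
```

Although execute() asks for a 60-second sleep, the call returns immediately because sleep() checks the interrupt status before waiting.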

Runtime Message Listeners

A component may need to be notified of important events even while it is executing (e.g., even while inside the execute() method). The system supplies a means for a component to register itself as a “runtime message” listener—that is, it can request that the system deliver important event messages to a special method on the component class.

To do this, the component class must implement the RuntimeMessageListener interface. This interface has a single method with the signature:

public void runtimeMessageReceived(RuntimeMessage msg) throws RtException;

The component must then register itself as a listener for runtime messages by calling the StationEnv.addRuntimeMessageListener() method. This needs to be done only once for the lifetime of the class instance, so it is suggested that this be done in the initialize() lifecycle method. Likewise, the destroy() method should un-register by calling the StationEnv.removeRuntimeMessageListener() method.

Unlike the component lifecycle methods, this method is called on an asynchronous thread. It may be called at any time, including during the time that another thread is executing some other method on the component, such as the execute() or destroy() methods. The implementation of this method must perform appropriate thread synchronization to avoid interference with other threads.

It is also possible that this method will be called from multiple threads at the same time to deliver multiple messages. The implementation must ensure proper thread synchronization in all cases.

When the system has a message to deliver to the component, it will call this method with a RuntimeMessage argument that indicates the nature of the message. The type of message can be determined by the getType() method on the RuntimeMessage argument. Each message type is implemented as a subclass of RuntimeMessage and has methods specific to its type. A component can receive the following types of runtime messages:

Type RuntimeMessage.TYPE_JOBCANCELLED: This message type is sent to a component when the job in which it is running has been cancelled. The component should respond by terminating its execute() method. For more information, see Component Cancel and Timeout. A component receives this message only if it is actively executing as part of the cancelled job. The message object will be of the RuntimeMessageJobCanceled type.
Type RuntimeMessage.TYPE_TIMEOUT: This message type is sent to a component when it has exceeded its maximum allowed execution time. The component should respond by terminating its execute() method. For more information, see Component Cancel and Timeout. The message object will be of the RuntimeMessageTimeout type.
Type RuntimeMessage.TYPE_JOBENDED: This message type is sent to all component instances whether or not they participated in the execution of the job. The notification will be delivered some time after the job has completed execution. The message object will be of the RuntimeMessageJobEnded type.

This method is designed to throw RtException types to indicate some failure to process the message. If this method throws an exception, the component execution will be stopped and the associated workitem will be considered to have failed. The exception will be written as an ERROR in the job log.
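One common way for a listener to stop a running execute() method is to set a thread-safe flag that the execution loop checks. The sketch below shows that pattern with hypothetical stand-in types: MessageType and MessageListener only imitate the roles of RuntimeMessage and RuntimeMessageListener, and the work callback exists solely so that a message can be delivered part-way through a run.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntConsumer;

// Hypothetical model of a listener that asks a running execution to stop.
final class StopFlagSketch {

    // Stand-ins for the message types described in the text.
    enum MessageType { JOB_CANCELLED, TIMEOUT, JOB_ENDED }

    // Stand-in for the listener interface described in the text.
    interface MessageListener {
        void messageReceived(MessageType type);
    }

    static final class Worker implements MessageListener {

        // Written by the listener thread, read by the executing thread.
        private final AtomicBoolean stopRequested = new AtomicBoolean(false);

        @Override
        public void messageReceived(MessageType type) {
            // Only cancel and timeout messages ask the execution to stop.
            if (type == MessageType.JOB_CANCELLED || type == MessageType.TIMEOUT) {
                stopRequested.set(true);
            }
        }

        // Runs up to maxSteps units of work; returns the number of steps completed.
        int execute(int maxSteps, IntConsumer work) {
            stopRequested.set(false); // start each execution with a clean flag
            int done = 0;
            while (done < maxSteps && !stopRequested.get()) {
                work.accept(done); // each step should be short
                done++;
            }
            return done;
        }
    }
}
```

The atomic flag provides the thread synchronization that the listener method is responsible for; no other state is shared between the two threads in this sketch.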

Subflow Result Listeners

Process-type components submit requests to the system to run the subflow below them in the model tree using the RuntimeEnv.getExecutor().runSubflow() method. Such components can make multiple requests without waiting for the first one to complete before requesting another. A process component that is searching about a design point might request a series of subflow executions to be run in parallel (e.g., they are all submitted without waiting for any of them to complete).

The component can then wait for all the subflows to complete by using the RuntimeEnv.getExecutor().waitForSubflow() method. However, this method waits for the slowest subflow to complete and does not return until all subflows have completed. Some process component algorithms may benefit from examining the results from each subflow as it completes, and not waiting until they are all done. Typically, the component does not know which subflow will complete first.

The component can register itself as a “listener” for subflow completion event messages and implement the SubflowListener interface so that it can request that subflow results be delivered to the component as soon as each one completes. The component can then examine results as they become available without waiting for the slowest subflow to complete. This interface has a single method with the signature

void subflowCompleted(SubflowResults results) throws RtException;

The component must also register itself as a listener by making a call to the RuntimeEnv.getExecutor().addSubflowListener() method. The call is made for each execution cycle of the component and is typically done in the preExecute() lifecycle method or as the first thing in the execute() method. The component should un-register in the postExecute() method by calling RuntimeEnv.getExecutor().removeSubflowListener().

This method will then be called once for each subflow after the subflow completes. Subflows often complete in a different order from how they were submitted. The SubflowResults object contains information about the status of the subflow (success/failure) and, if it was successful, the resulting parameter values.

Unlike the component lifecycle methods, this method is called on a separate asynchronous thread. The component developer is responsible for synchronizing any data shared with the thread running the execute() method. Typically, this method will be called while the execute() method is running on the component execution thread.

Each event, or call to this method, may be made on a different thread, but the system guarantees that this method will only be called serially. That is, only one thread will be in this method at a time, but each call to this method may be from a different thread (i.e., this method should make no assumptions about the identity of the thread that calls it).

This listener method is free to use non-thread-safe data structures as long as those structures are not also accessed from the execute() method. Any shared structures must be synchronized explicitly by the component code.
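Explicit synchronization of data shared between the listener method and the execute() method might look like the following. The class SharedResultsSketch and its method names are hypothetical; plain threads stand in for the system threads that deliver subflow results, and a single lock object guards every field that both sides touch.

```java
// Hypothetical model of data shared between a listener method and execute().
final class SharedResultsSketch {
    private final Object lock = new Object();
    private double best = Double.POSITIVE_INFINITY; // guarded by lock
    private int completed;                          // guarded by lock

    // Plays the role of the listener method: called on a thread other than execute().
    void onResult(double value) {
        synchronized (lock) {
            completed++;
            if (value < best) {
                best = value;
            }
        }
    }

    // Safe to call from the execute() thread at any time.
    double bestSoFar() {
        synchronized (lock) {
            return best;
        }
    }

    int completedCount() {
        synchronized (lock) {
            return completed;
        }
    }

    // Demonstration: delivers each value on its own thread and returns the best one.
    static double bestOf(double[] values) {
        SharedResultsSketch shared = new SharedResultsSketch();
        Thread[] threads = new Thread[values.length];
        for (int i = 0; i < values.length; i++) {
            final double v = values[i];
            threads[i] = new Thread(() -> shared.onResult(v));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.bestSoFar();
    }
}
```

Because both the writer and the readers take the same lock, the execute() thread always sees a consistent pair of values, regardless of which thread delivered the most recent result.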

The component can throw an RtException from this method to signal an unexpected error condition. The component execution will be stopped, the component execution will be considered as “failed,” and the exception will be reported as the cause of the failure in the job log.

The following sample component submits a set of subflows to execute and then processes each subflow as it completes by writing a message to the job log.

public class MyComponent
   extends AbstractComponent implements SubflowListener {

   private RuntimeEnv env; // For event method to use

   public void execute(RuntimeEnv runEnv) throws RtException {
      try {
         env = runEnv; // Make available to event method

         // Run 10 subflows in parallel
         for (int i = 0; i < 10; i++) {
            Context ctx = runEnv.getExecutor().createSubflowContext();
            //...modify the subflow parameter values...
            runEnv.getExecutor().runSubflow(ctx);
         }

         // Wait for all of them to complete before returning
         runEnv.getExecutor().waitForSubflow();
      }
      catch (Throwable e) {
         throw new RtException(e, "Failed to execute my component.");
      }
      finally { // Always cleanup class field object references
         env = null;
      }
   }

   public void subflowCompleted(SubflowResults results) throws RtException {
      if (results.getCc() == PSEUtils.WORK_CC_OK) {
         env.getJobLog().logWarn("Run number " + results.getRunNumber() + " completed OK.");
      }
      else {
         env.getJobLog().logWarn("Run number " + results.getRunNumber() + " failed.");
      }
   }
}

The system guarantees that any thread blocked on waitForSubflow() will not be released until all subflows have finished and have been fully processed by the subflowCompleted() event method. Thus, it is safe to set the env field to null in the execute() method after the waitForSubflow() call returns. The subflowCompleted() method that uses the env field will never be called after the wait method returns.