Daemon Divertimento: Creating Background Processes in ColdFusion
I've always been fascinated by the idea of software daemons, like tiny green gnomes lurking in the insides of the computer, just waiting to quickly and silently take care of multiple tasks. When ColdFusion 8 came out with the new CFThread tag, for some reason one of my first thoughts was whether I could somehow use this tag to create my own ColdFusion-based "daemons". I wondered what would happen if you created a thread and fired it off on an endless loop, always checking for some condition or some occurrence to act on. Of course, after having developed ColdFusion pages for so long, the idea of an infinite loop in CF only brought images of chaos and crashing servers to my mind; however, the curious and mischievous side of me prompted me to just try it out and "see what happens".
So I quickly developed a basic proof of concept to see what would happen if I created a thread and just set it off in an infinite loop. Thus, with fear in my soul, I opened the browser and called the template... And nothing happened. I looked at the server memory, the JVM memory, processor load, everywhere... and nothing... just a little bump for the template being executed. Even the CF Monitor would only report it as a long-running template, and nothing more.
Now, my second assumption was that the ColdFusion template timeout would kick in and the thread would be stopped with a timeout error. However, to my surprise, this didn't happen either. Apparently, threads are not bound by the execution time limits set for regular templates.
At this point I knew two things: first, a constantly running thread would not bring the server down (as in 'no nasty memory leaks') and second, threads could run as long as they wish without anyone bothering them. These two things meant that, at least in theory, one could build constantly running background processes using only standard ColdFusion tags; no need for event gateways or additional Java voodoo.
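To make the idea concrete, here is a minimal sketch of that kind of experiment. It is not the original proof of concept: the thread name, the `application.keepRunning` flag and the one-second pause are all assumptions added for illustration, and it presumes an Application.cfc or Application.cfm that names the application so the application scope exists.

```cfml
<!--- Hypothetical test template; the names and the exit flag are illustrative assumptions --->
<cfset application.keepRunning = true>

<cfthread action="run" name="endlessLoop">
    <!--- Runs until some other request sets application.keepRunning to false --->
    <cfloop condition="application.keepRunning">
        <!--- Check for work here. The pause is an addition so the loop does not spin flat out. --->
        <cfset sleep(1000)>
    </cfloop>
</cfthread>

<cfoutput>Thread launched; this page returns without waiting for it.</cfoutput>
```

The flag gives the loop a way out, so another request can end it by setting `application.keepRunning` to false.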
So I just went ahead and started building a full featured proof of concept to exploit these findings.
Thus I built myself a little logger component. The main idea is that the parts of an application or site that want to log a message would only need to place the desired message on a centrally available queue. At the same time, our background process would be monitoring this queue, and whenever a new message is added it would just pop it off the queue and write it to the log. This way the application would be completely detached from the logic or mechanism used to write the log.
As a side note, it is interesting that having a continuously running process also opens the possibility of not only reacting to incoming requests (or function calls) but also proactively taking action. For example, every 15 minutes the logging component writes a message to the console (cfserver.log) indicating the number of messages logged since the last report.
Here is the code for the component. Look at the comments for detailed explanation of what each part does:
<cfcomponent>
<!--- loggerd.cfc
by Oscar Arevalo (www.oscararevalo.com)
Loggerd.cfc is a proof of concept on how to implement background processes in ColdFusion
**** Feel free to use this in any way you like. ****
--->
<cfset variables.instance = structNew()>
<cfset variables.messageQueue = arrayNew(1)> <!--- main queue to store the messages to be logged --->
<cfset variables.maxQueueSize = 500> <!--- maximum number of messages to allow in the queue --->
<cfset variables.threadActive = false> <!--- this flag is used to control the main loop --->
<cfset variables.threadID = 0> <!--- an identifier for the main loop --->
<cfset variables.debugMode = true> <!--- when true, outputs debugging messages to the console (cfserver.log) --->
<cfset variables.logFilePath = ""> <!--- the path where the log file will be located --->
<cffunction name="init" access="public" returntype="loggerd" hint="constructor">
<cfargument name="logFilePath" type="string" required="true" hint="">
<cfargument name="maxQueueSize" type="numeric" required="false" default="#variables.maxQueueSize#" hint="The maximum number of messages that can be stored on the queue at any given time">
<cfset variables.messageQueue = arrayNew(1)>
<cfset variables.maxQueueSize = arguments.maxQueueSize>
<cfset variables.logFilePath = arguments.logFilePath>
<!--- start the main loop --->
<cfset startQueueLoop()>
<cfreturn this>
</cffunction>
<cffunction name="queueMessage" access="public" returntype="boolean"
hint="adds a message to the queue to be logged, returns True if the message was added successfully. ">
<cfargument name="message" type="String" required="true">
<cfargument name="severity" type="String" required="false" default="INFO">
<cfscript>
var ret = false;
var st = structNew();
// create a structure to add to the queue
st.timestamp = now();
st.message = arguments.message;
st.severity = arguments.severity;
// only add the message to the queue if the queue has not reached the maximum size
if(arrayLen(variables.messageQueue) lt variables.maxQueueSize) {
arrayAppend(variables.messageQueue, duplicate(st));
ret = true;
}
// note that there is virtually no overhead for adding a message to the queue; the caller of this function doesn't
// need to wait for any IO or other process to finish before regaining control.
return ret;
</cfscript>
</cffunction>
<cffunction name="setQueueThreadActiveStatus" access="public" returntype="void"
hint="sets the status flag to control the main loop. A value of TRUE will keep the main loop running, or if it is not running, will cause it to start. A value of false will stop the main loop">
<cfargument name="threadActiveStatus" type="boolean" required="true">
<!--- start the queue loop only if it is off --->
<cfif not variables.threadActive and arguments.threadActiveStatus>
<cfset startQueueLoop()>
</cfif>
<cfset variables.threadActive = arguments.threadActiveStatus>
</cffunction>
<cffunction name="getQueueThreadActiveStatus" access="public" returntype="boolean" hint="returns the current status of the main loop. TRUE means that the loop is running and the queue is being processed">
<cfreturn variables.threadActive>
</cffunction>
<!--- PRIVATE METHODS --->
<cffunction name="writeToLog" access="private" returnType="void" hint="writes a message to the log">
<cfargument name="messageStruct" type="struct" required="true" hint="a structure containing the information to log">
<cfset var st = arguments.messageStruct>
<cfset var tsStr = "#lsDateFormat(st.timestamp)# #TimeFormat(st.timestamp,'HH:mm:ss')#"> <!--- HH = 24-hour clock, so AM and PM entries are not ambiguous --->
<cfif not fileExists(variables.logFilePath)>
<cffile action="write" file="#variables.logFilePath#" output="--- Log Created on #tsStr# ---">
</cfif>
<cffile action="append" file="#variables.logFilePath#" output="#tsStr# #st.severity# #st.message#">
</cffunction>
<cffunction name="consoleDump" access="private" returntype="void" hint="dumps a message to the console">
<cfargument name="threadID" type="string" required="true">
<cfargument name="msg" type="string" required="true">
<cfset var msg1 = "[loggerd] " & "[" & arguments.threadID & "] " & timeformat(now(),"HH:mm:ss") & ": " & arguments.msg>
<cfif variables.debugMode>
<cfdump var="#msg1#" output="console">
</cfif>
</cffunction>
<cffunction name="startQueueLoop" access="private" returntype="void" hint="starts the main loop to process queue elements. The loop will continue until the QueueThreadActiveStatus is set to false">
<cfset var msgCount = 0>
<cfset variables.threadID = createUUID()> <!--- assign an ID to the loop --->
<!--- here we create a global reference to the current thread so that anyone can check if there is a thread running or not,
and also the identity of the currently running thread --->
<cfset application.logger_currentInstanceID = variables.threadID>
<cfthread action="run" name="queueLoop">
<cfscript>
consoleDump(variables.threadID, "Starting thread..." );
// set status to active
variables.threadActive = true;
// initialize counter
msgCount = 0;
try {
while(variables.threadActive) {
// loop will keep running until the threadActive flag is set to false
// check if this is the same logger instance, otherwise get out
if(variables.threadID neq application.logger_currentInstanceID)
variables.threadActive = false;
// check if there are any messages to process in the queue
if(arrayLen(variables.messageQueue)) {
// process message
writeToLog( variables.messageQueue[1] );
// remove message from queue
arrayDeleteAt( variables.messageQueue, 1 );
// increment counter
msgCount = msgCount + 1;
consoleDump(variables.threadID, "Message logged and removed from queue.");
}
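// check if it is time for the status report: at most once per quarter-hour minute
// (waiting for the exact zero millisecond could miss the moment or fire more than once)
if(minute(now()) mod 15 eq 0 and (not structKeyExists(variables,"lastReportStamp") or variables.lastReportStamp neq timeFormat(now(),"HHmm"))) {
consoleDump(variables.threadID, "Number of messages logged since last report: #msgCount#");
msgCount = 0;
variables.lastReportStamp = timeFormat(now(),"HHmm");
}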
}
} catch(any e) {
consoleDump(variables.threadID, "Error occurred during thread loop: #e.message#" );
}
// make sure the threadActive flag is set to false
variables.threadActive = false;
consoleDump(variables.threadID, "Thread finished." );
</cfscript>
</cfthread>
</cffunction>
</cfcomponent>
I'm attaching the sample code I made for the logger component. The zip includes the loggerd.cfc and an index.cfm file to do some testing. Feel free to download and play with it. Of course this is just a proof of concept and is not production quality code. This is just intended to suggest a new avenue for ColdFusion development and ideas that could benefit from this kind of approach.
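For readers without the zip, a test page might look roughly like the sketch below. This is not the original index.cfm: the lock name, the log file name and the `application.logger` key are assumptions, and it expects loggerd.cfc to sit in the same directory. It only uses the public methods shown in the listing above.

```cfml
<!--- Create the single shared instance once; the lock keeps two simultaneous
      requests from both starting a background loop. Names here are hypothetical. --->
<cflock name="loggerdInit" type="exclusive" timeout="10">
    <cfif not structKeyExists(application, "logger")>
        <cfset application.logger = createObject("component", "loggerd").init(logFilePath = expandPath("./loggerd-test.log"))>
    </cfif>
</cflock>

<!--- Queue a message; the call returns right away and the background loop does the file write --->
<cfset wasQueued = application.logger.queueMessage("Test message from the sample page", "INFO")>

<cfoutput>
    Message queued: #wasQueued#<br>
    Loop running: #application.logger.getQueueThreadActiveStatus()#
</cfoutput>
```

Keeping the instance in the application scope is what lets every request share one queue and one background thread.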
It would be interesting to see what ideas other developers can come up with in which something like this could apply.
I guess the main reason for doing something like this would be that you might not have access to the CF Administrator to set up the Event Gateways. Also, with event gateways there is only a limited set of different gateways that you can use (without having to go into Java and create your own), so having a pure CF solution could be an option.
1) If enough people on a shared server used this trick, it could use up all the threads available to CF and CF would die.
2) There is no logging like in a scheduled task unless, like you did, you write your own.
3) It looks like, from your code, it could be a PITA to stop the thread. Come to think of it, how do you stop it if it's in an infinite loop?
4) There is no easy way to list all of the currently running threads on the server, so how do you know how many have spawned?
Yes, this is a cool little thing that you discovered. However, I think that cfthread for the most part will go the way of event gateways. I never understood what the big deal was with event gateways when 99% of the time a scheduled task would fit the bill.
You are absolutely right, I guess I forgot to put a big sign with big bold red letters saying "Handle With Care", with the bio-hazard sign next to it. Like anytime you deal with multithreading, this has to be done VERY carefully.
To address some of your points:
1. The creation of these threads should be controlled in such a way that only one thread is created. This thread would be the one in charge of doing all the background work.
2. You are right there would be no logging unless you write your own.
3. Being an endless loop there needs to be some sort of exit condition so that it can be stopped. In the particular example of my code, see that the loop is contained within a While() block that looks at the threadActive variable. Being an asynchronous process executing within a persisted component on the Application scope, a calling template can just call the method of the component that sets the threadActive variable to false, thus breaking the main loop.
4. As noted in point 1, the application should not create multiple threads.
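As a rough illustration of point 3, a "stop" page could be as small as the sketch below. It assumes the component instance was stored as `application.logger`, which is a hypothetical name; the method call itself comes from the component listing.

```cfml
<!--- Hypothetical stop page; assumes the component instance lives in application.logger --->
<cfif structKeyExists(application, "logger")>
    <!--- Flip the flag that the while() loop checks on every pass --->
    <cfset application.logger.setQueueThreadActiveStatus(false)>
    <cfoutput>Stop requested; the loop exits the next time it checks the flag.</cfoutput>
<cfelse>
    <cfoutput>No logger instance found in the application scope.</cfoutput>
</cfif>
```

The loop does not end instantly: it finishes its current pass first, so any message already being written still goes to the log.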
Remember also that CF can execute JSP. So why not create a real Java timer using java.util.Timer and TimerTask? You even have access to the pageContext() from within JSP.
Might be more complicated, but I would be really scared of taking an infinite running cf thread into a production environment.
That said, man, you are FUNKY!! Great find Oscar!!
-Randy
I'm looking at queuing all requests and making sure no more than x of them are running at once. Once one is done, the next in the queue should start, until all are done. Then, something should be checking in to see if requests are in the queue.
In my case, it's creating PDFs using Fop.