Object instantiation and heap size in ColdFusion - Part IV

Digging further into my Java heap issue, I decided to try to figure out what was causing these objects to remain in memory for the life of the request. I modified my code so that all the beans were instantiated in the request scope. I still maintained a function in the main cfm template to set properties of the object in the XML document, and this function accepted an object as an argument. I thought this might affect the ability of CF to garbage collect the objects, but I decided to see what would happen anyway. My test showed a clear trend toward a larger heap size when indexing 10,000 documents, although the heap in this test maxed out at 158 MB and seemed to have some ability to clear memory along the way.

Next, I modified the inline function so that it used the object bean in the request scope rather than accepting a bean as an argument. Under this scenario, the beans would all be instantiated in the request scope and used in the request scope, all within a single cfm page. Would that be enough to allow gc to remove the objects from memory? Under this scenario, the heap maxed out at 140 MB, a bit lower than previously, but perhaps not significant.

After that, I modified the XML routines so that the XML documents were also created and used in the request scope and never passed from one place to another. At this point, I was still using components, but not passing any complex datatypes from place to place, instead simply referencing them in the request scope. This time around, the heap peaked at 137 MB. Moving from unscoped to request-scoped variables and eliminating the passing of complex objects between components seems to have had a limited impact on the amount of memory used in the heap during the indexing process.

As an aside, in these later cases, I made it a point to click the Run GC button in the CF Server Monitor right after the request finished executing. In all cases, the heap dropped off to roughly 20 MB. That means that these requests used anywhere from 40 to 180 MB of memory in the heap to process the same data against the same indexing engine.
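For readers who want to take a similar before/peak/after reading outside the Server Monitor, here is a minimal sketch in plain Java rather than CFML. It is not the code used in these tests. The class name HeapProbe and the method sample are hypothetical, and System.gc() is only a request that the JVM is free to ignore, so the numbers will vary from run to run.

```java
class HeapProbe {
    private static final long MB = 1024L * 1024L;

    /** Used heap in MB: heap currently reserved by the JVM minus its free part. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    /**
     * Allocates roughly `megabytes` MB of throwaway data, then releases it.
     * Returns {before, peak, afterGc} readings in MB.
     */
    static long[] sample(int megabytes) {
        long before = usedHeapMb();
        byte[][] scratch = new byte[megabytes][];
        for (int i = 0; i < scratch.length; i++) {
            scratch[i] = new byte[(int) MB];
            scratch[i][0] = 1; // touch the block so the allocation is really used
        }
        long peak = usedHeapMb();
        scratch = null; // drop the only reference to the throwaway data
        System.gc();    // a request to the JVM, not a guarantee
        long afterGc = usedHeapMb();
        return new long[] { before, peak, afterGc };
    }

    public static void main(String[] args) {
        long[] r = sample(16);
        System.out.println("before=" + r[0] + " MB, peak=" + r[1]
                + " MB, after GC=" + r[2] + " MB");
    }
}
```

The gap between the peak and after-GC readings is the same kind of figure as the drop to roughly 20 MB described above: memory that was in use during the work but was no longer reachable once it finished.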

It would seem that the use of components as beans to hold data creates a serious potential performance problem for CF-based apps in cases where those beans are persisted (e.g. in the application or session scope) or where many beans are instantiated in a single, long-running request. I don't see any problem using beans in other scenarios, e.g. using beans to store a small number of records returned from the database, or using a bean to model a form in an HTML or Flex front end. I would be cautious, though, of using a large number of beans in a request or persisting beans in the session scope where they may not be garbage collected.
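The general JVM rule behind this concern can be sketched in plain Java (Java 9 or later, for Reference.reachabilityFence). This does not reproduce ColdFusion's internals. DocumentBean is a hypothetical stand-in for a CFC bean, and the longLived list is an assumed analogue of a persistent scope or a request that keeps every bean referenced until it ends.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

class BeanRetention {
    /** Hypothetical stand-in for a CFC data bean. */
    static final class DocumentBean {
        final int id;
        final String title;
        DocumentBean(int id, String title) { this.id = id; this.title = title; }
    }

    /** Counts how many weakly tracked beans are still reachable. */
    static int countAlive(List<WeakReference<DocumentBean>> tracked) {
        int alive = 0;
        for (WeakReference<DocumentBean> ref : tracked) {
            if (ref.get() != null) alive++;
        }
        return alive;
    }

    /**
     * Creates `count` beans in a loop. When `retain` is true, every bean is also
     * stored in a long-lived list; otherwise each bean is dropped after its
     * iteration. Returns how many beans are still reachable after requesting GC.
     */
    static int run(int count, boolean retain) {
        List<DocumentBean> longLived = new ArrayList<>();
        List<WeakReference<DocumentBean>> tracked = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            DocumentBean bean = new DocumentBean(i, "doc-" + i);
            tracked.add(new WeakReference<>(bean)); // weak: does not keep the bean alive
            if (retain) longLived.add(bean);        // strong: keeps the bean alive
        }
        System.gc(); // a request only; the collector may or may not run
        int alive = countAlive(tracked);
        Reference.reachabilityFence(longLived); // keep the list reachable until here
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("retained:     " + run(10_000, true) + " beans still reachable");
        System.out.println("not retained: " + run(10_000, false) + " beans still reachable");
    }
}
```

With retain set to true, every bean must survive because the list still references it. With retain set to false, the beans are eligible for collection, though how many are actually cleared depends on whether and when the collector runs.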

Potential Solutions

Adobe needs to figure out a way for garbage collection of components to be handled more efficiently. In my case, the objects sitting in memory have no further value after being used once in a loop, and they should be subject to garbage collection. My suggestion would be to implement NULLs for variables in CF, perhaps using javacast notation to allow developers to specifically mark objects for deletion.

I gave this technique a shot in my last code iteration, assigning javacast("null","") to the bean variable. (The CF docs specifically warn against doing this, saying that unpredictable results will occur.) My test demonstrated that, rather than eliminating the beans from the heap, the use of this function caused the heap to skyrocket to over 140 MB within the first 2,000 documents.
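In plain JVM terms, nulling a variable only helps when that variable holds the last strong reference to the object. The sketch below (Java 9 or later) shows that rule in isolation. It is not a diagnosis of the result above, and whether ColdFusion keeps additional references to a CFC instance is an assumption here, not something these tests establish. The names NullingDemo and otherHolder are hypothetical.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class NullingDemo {
    /**
     * Nulls one variable while a second strong reference still exists.
     * Returns true if the object is still reachable after requesting GC.
     */
    static boolean survivesNulling() {
        Object bean = new Object();
        Object otherHolder = bean; // second strong reference (e.g. a scope or registry)
        WeakReference<Object> probe = new WeakReference<>(bean);

        bean = null;  // "mark for deletion" through one variable only
        System.gc();  // a request to the JVM, not a guarantee

        boolean alive = probe.get() != null;
        Reference.reachabilityFence(otherHolder); // keep the second reference live until here
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("still reachable after nulling: " + survivesNulling());
    }
}
```

Because otherHolder still points at the object, the collector is not allowed to reclaim it, no matter how many times the first variable is set to null.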

My suggestion is for Adobe to implement NULLs and allow developers to explicitly mark objects for deletion. Anyone else have a suggestion? Let's hear it. This is an issue that needs to be resolved.

Comments
Expecting CFers to manage memory like that is just asking for trouble. You don't show any code in your four articles so it's really impossible to tell from your description exactly what it is that causes the ramp up in memory.

I do agree that converting a query into an array of beans - which I think is part of what you are getting at - is just plain dumb and anyone who does that deserves any performance problems they get.

CF can't protect you from all kinds of stupidity and there are lots of ways to write poorly performing code :)
# Posted By Sean Corfield | 1/20/08 9:25 PM
Sean, that is an example of what I am getting at. You don't even have to create a bunch of beans. You can create a single bean and repopulate it from an arbitrary source like a set of nodes from an XML document. Without a way to mark an instantiated CFC for garbage collection, loading enough beans - or even re-using a bean with new data - will eventually fill the heap and knock over the server. It's an edge case, to be sure, but I still feel that the underlying issue of not being able to mark an object for gc should be addressed.
# Posted By Rob Munn | 1/20/08 10:16 PM
Also, I realize I neglected to post any code, my bad. I am going to post an addendum with a bunch of code examples to demonstrate the issue.

I don't expect the average CF developer to need a feature like this, but it would be great to have it.
# Posted By Rob Munn | 1/20/08 10:19 PM
I'm glad I found this series of entries--I've been grappling with some serious performance degradation in an application I inherited, and I think you've basically backed up my hunches about it. Other than programmatic/improved garbage collection in ColdFusion, I'm sure there are ways (as Sean suggests) of avoiding bean overuse.

Do you have any recommendations on when, and when not to, use beans?
# Posted By Chris Herdt | 3/9/08 10:00 AM
Chris,

Using beans v. not using beans. Based on my experience, I am tempted to say that you should only use components for persistent objects like DAOs and gateways until Adobe fixes the memory issue with shared scope objects, but that isn't a realistic outlook. I use beans in the session scope to hold user information, and I will continue to do so. If you look at Jason Sheedy's blog I believe there is a solution to eliminating session objects when the session ends.

As for using beans to move data to and from the persistence layer, that really depends on your application. The place where I really see problems with CF right now is in long-running requests that generate lots of objects, or that re-use an object iteratively over many iterations. I would avoid using objects in those situations.
# Posted By Robert Munn | 3/15/08 4:41 PM