  • Gustas Sagaitis 14 posts 93 karma points
    May 01, 2014 @ 16:27
    Gustas Sagaitis
    0

    imagegen large index.xml file size, high CPU and memory usage

    Hi,

    I have about 1000 images in one folder, ranging from 4 KB to 90 KB each and about 16 MB in total, so not much at all.

    Then there is a page that loads a bunch of those images using ImageGen, not all of them but 200 or so. The parameters used are width, altimage and image.

    Then ImageGen (the w3wp.exe process) starts consuming all available CPU and memory. In less than 30 seconds it consumes 6 GB of RAM and then some, crashing the entire machine. I have observed that it creates a very large index.xml file; in my case it managed to create a 16 GB index file with many duplicate entries.

    The version I am using is 2.9.1.

    I found this thread, which seems to describe a similar problem: http://our.umbraco.org/projects/website-utilities/imagegen/imagegen-bugs/31334-Imagegen-problem-with-large-image-folder

    Any ideas what could be causing it? Is there a fix?

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    May 01, 2014 @ 17:33
    Douglas Robar
    101

    Hi, Gustas,

    Thanks for the forum post. Very odd behavior when using ImageGen 2.9.1.

    ImageGen will attempt to load a 16 GB index file into memory, which will obviously swamp the 6 GB of RAM on the server, so that's likely the source of the problem. The question is, why is it 16 GB? That's an insane size. I wonder if this was an upgrade from an earlier version and the old (and outdated) cache information was not deleted.

    In any event, it is always safe to delete 'cached' folders as any needed images will be recreated and saved when next requested. Granted, the initial generation of resized images will place a burden on the cpu and memory, and .net and ImageGen use available resources to perform the tasks as quickly as possible for website visitors. But once the cache is primed subsequent requests can simply send a 304 response or the already-cached image from the server.

    1. Please delete the 'cached' folder. (this might take some time as Windows can be slow with many small files)
    2. When the cached folder is finally deleted, recycle the application pool to clear anything in memory
    3. Again delete anything in the 'cached' folder that might have been put there if the site were busy while doing the prior two steps
    4. Again, recycle the app pool to clear anything in memory

    Typically you'd only need to do steps 1 & 2 but as you've got a problem it's worth being doubly sure all old cached data is cleared from disk and memory. The index.xml file and resized images should begin appearing in the cached folder upon the next request. The index.xml file should be only a few KB to begin with and grow to some number of MB as the cache is primed. It should never be into GB and if it gets there (especially immediately) then the steps above weren't sufficient to clear the memory of the previous cache data.
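
The clean-up steps above can be sketched as a small script. This is a minimal illustration in Python, not part of ImageGen: the function names, the case-insensitive match on a folder named 'cached', and the dry-run default are all assumptions. It only removes files on disk, so it would still need IIS to be stopped first or an application pool recycle afterwards, as described in the steps.

```python
import os
import shutil


def find_cached_folders(root, folder_name="cached"):
    """Return (path, index_size_bytes) for every cache folder under root.

    index_size_bytes is None when the folder has no index.xml.
    The folder name is matched case-insensitively (an assumption).
    """
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        for name in list(dirnames):
            if name.lower() == folder_name:
                path = os.path.join(dirpath, name)
                index_path = os.path.join(path, "index.xml")
                size = None
                if os.path.isfile(index_path):
                    size = os.path.getsize(index_path)
                found.append((path, size))
                # Do not walk into the cache folder itself.
                dirnames.remove(name)
    return found


def clear_cached_folders(root, dry_run=True):
    """Delete every cache folder under root; with dry_run, only list them."""
    folders = find_cached_folders(root)
    if not dry_run:
        for path, _size in folders:
            shutil.rmtree(path)
    return [path for path, _size in folders]
```

Calling `find_cached_folders` on the site root first gives a quick view of which index.xml files have grown unusually large before anything is deleted.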

    Let us know what you find out.

    cheers,
    doug.

  • Gustas Sagaitis 14 posts 93 karma points
    May 02, 2014 @ 10:08
    Gustas Sagaitis
    0

    Hi Douglas,

    I deleted all "Cached" folders while IIS was not running, then started IIS and the application, and it worked great. Of course the initial impact of warming up the cache can be seen, but that is expected.

    Note that I tried these things (deleting "Cached", restarting IIS, recycling the application pool, in various combinations) before creating this thread, and even rebooted the machine a couple of times, but that had no impact. Somehow today I cannot reproduce the same behaviour.

    Anyway, if the issue repeats itself I will post here again.

    I think the size of index.xml (16 GB when I first discovered it) has no significance here. It would just grow as fast as the hard drive could write it until I killed w3wp.exe.

    Thanks for your help Douglas!

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    May 02, 2014 @ 10:17
    Douglas Robar
    0

    Hi, Gustas,

    Glad you got it sorted out! Stopping IIS and deleting the 'cached' folder is a great solution and will definitely do the job. And please let me know if you have any problems in the future.

    FWIW, the size of the index.xml file is relevant since ImageGen will have to read the file into memory to find the appropriate entry. At that monstrous size the application pool/ImageGen would consume all the available memory (6GB in your case) and still need more so would start swapping to disk and taking a ton of time and possibly crash the application pool in the process.

    The next version of ImageGen will remove the reliance on an index.xml file completely to save memory and improve performance even further.

    cheers,
    doug.

  • andy_rose 91 posts 117 karma points
    Jun 25, 2014 @ 10:15
    andy_rose
    0

    Hi,

    Just want to add that we are experiencing a similar issue with ImageGen Pro v2.9.1.36630 but only on one server in our web farm.

    We are using ImageGen to serve resized images to a CDN which is requesting these from two servers in our web farm of four servers. They are both running Windows Server 2008 R2 with service pack 1; one is cloud based and the other is physical. The physical server is the one that is having issues, in that the CPU and memory usage spikes periodically, causing the server to fail. Just to note that the physical server has more cores and memory than the cloud server.

    When checking the server during this behaviour I searched for index.xml files and found one that had reached over 1 GB. I managed to copy it and wrote a small console app to load the xml, which revealed that the xml was broken (missing closing tags). I fixed this in gVim and queried the xml again, which revealed over 1000 entries for each cached image, so I'm guessing the building of this file got into some kind of loop until it failed. I've since noticed that this index.xml file has disappeared and been replaced by a new one that doesn't reference all the cached images that were previously created. That leads me to suspect we have a lot of orphaned images taking up disk space that won't be used.
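
A check along these lines can also be sketched in Python rather than as a console app. The thread does not show the structure of index.xml, so the element name passed as `entry_tag` is a hypothetical placeholder to be replaced after inspecting the real file. The sketch streams the file instead of loading it whole, and treats a parse error (such as missing closing tags) as the end of the usable data rather than as a failure.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_duplicate_entries(path, entry_tag):
    """Count identical entry elements in a possibly truncated XML file.

    Returns (counts, complete): counts maps each serialized entry to the
    number of times it appears; complete is False if parsing stopped early.
    entry_tag is an assumed element name, not taken from ImageGen.
    """
    counts = Counter()
    complete = True
    root = None
    try:
        for event, elem in ET.iterparse(path, events=("start", "end")):
            if root is None:
                root = elem  # the first "start" event is the document root
            if event == "end" and elem.tag == entry_tag:
                elem.tail = None  # ignore whitespace between entries
                counts[ET.tostring(elem, encoding="unicode")] += 1
                root.clear()  # drop processed entries to keep memory flat
    except ET.ParseError:
        # Broken XML: keep what was counted before the error.
        complete = False
    return counts, complete
```

`counts.most_common(10)` then lists the entries that were written most often, which is one way to confirm the kind of duplication described here.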

    Any ideas what the underlying cause could be? I've seen that Server 2008 R2 has been a problem before but I thought that was now fixed. Also, I see that the index.xml files are being removed in a future release; any idea when this may be? Final question: should I delete the Cached folder to remove the orphaned images, or is this handled in ImageGen?

    Sorry for the wall of text, just wanted to give as much info as possible. Really liking ImageGen and hopefully this can be resolved quickly.

    Andy

  • Gustas Sagaitis 14 posts 93 karma points
    Jun 25, 2014 @ 10:44
    Gustas Sagaitis
    0

    Very interesting. I haven't had the problem since I posted last time.

    My best guess for this intermittent issue is that it is some sort of race condition that cannot be reproduced easily.

  • Katie 25 posts 90 karma points
    Jun 27, 2014 @ 10:46
    Katie
    0

    I have also had the same problem, except my index.xml file ended up 180 GB in size and crashed my whole server. Yesterday it occurred again, creating an 83.6 GB index.xml file.

    I have tried your resolution above, which resolves the problem in the short term, but this is now occurring every couple of days as the website becomes busier.

    I am using ImageGen v2.9.1.36630 on Windows Server 2008 R2 with service pack 1.

    Please fix the problem, ImageGen is brilliant and very useful.

  • Douglas Robar 3570 posts 4711 karma points MVP ∞ admin c-trib
    Jun 27, 2014 @ 11:01
    Douglas Robar
    0

    Hi Andy,

    It's always safe to delete a cached folder. Images will be recreated as needed.

    You're correct that, as a last resort, if the xml file becomes irretrievably corrupt it is deleted and any previously-cached images can become abandoned. This was particularly the case with version 2.5 and earlier. With the latest versions, however, cache filenames are deterministic, based on a hash of the original filename and the parameters requested. Thus, if a request comes in matching an already-cached file that exists on disk but isn't in the xml file, the cached file will be reused. Because of this, truly abandoned cached files would only occur if a previously requested combination of image and parameters is never requested again, and you can freely delete any old images in a cached folder, even without updating the xml file accordingly. While this leaves a bit of overhead in the xml file, there is always a check that the cached file exists before attempting to send it to the visitor. If the cached file has been deleted it will be recreated first.
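
The idea of deterministic cache filenames can be illustrated with a short Python sketch. This is not ImageGen's actual scheme: the SHA-1 hash, the layout of the canonical string, the case-insensitive handling of paths and parameter names, and the function names are all assumptions, chosen only to show why a file on disk can be found again without any index entry.

```python
import hashlib
import os


def cache_filename(image_path, params):
    """Derive a stable cache file name from the source path and parameters."""
    # Normalise and sort the parameters so that the same request always
    # produces the same name, whatever order the parameters arrive in.
    pairs = sorted((str(k).lower(), str(v)) for k, v in params.items())
    canonical = image_path.lower() + "?" + "&".join(f"{k}={v}" for k, v in pairs)
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    extension = os.path.splitext(image_path)[1].lower()
    return digest + extension


def get_or_create(cache_dir, image_path, params, render):
    """Reuse the cached file if it exists on disk; otherwise render it.

    render is a caller-supplied function returning the resized image bytes.
    """
    target = os.path.join(cache_dir, cache_filename(image_path, params))
    if not os.path.isfile(target):
        os.makedirs(cache_dir, exist_ok=True)
        with open(target, "wb") as handle:
            handle.write(render(image_path, params))
    return target
```

Because the name depends only on the request, deleting a cached file simply causes it to be rendered again on the next matching request, which mirrors the existence check described above.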

    An enormous (and possibly corrupt) xml file can arise for a few reasons, particularly in load-balanced environments. What needs to be kept in mind is that each web server running ImageGen is responsible for its own cache. Two servers must not attempt to share the cached folders on a shared drive, and most especially not the index.xml file itself. To do so would quickly cause contention, performance problems, and an ever-growing xml file and memory usage. We're working on alleviating this problem for easier use in load-balanced environments but as yet don't have a released solution. Until then, don't replicate cached folders.

    cheers,
    doug.

  • Alex Campbell 47 posts 69 karma points
    Aug 26, 2020 @ 14:20
    Alex Campbell
    0

    Sorry to pull up an old thread, but we're having this problem using a Windows server accessing Media from Azure Blob storage. The site behaves fine until we get loads of duplicate entries in the index.xml file; it grows to about 2-4 GB and then everything grinds to a halt.

    As Doug didn't manage to get v3 out, did anyone find an answer to this issue?

    Thanks,

    Alex
