I recently added a maintenance form to our website that lets a user add and delete entries in a list of banner ads stored in a simple XML file. Previously we maintained the list by hand, editing the XML directly and copying the associated banner images into a folder on the website. When an ad is deleted, I decided NOT to delete the associated image, since it may be reused (and I wanted to avoid forcing the user to upload it again). However, we'd still like to clean out this folder periodically and "archive" the unused images so they don't clutter up the selection screen.

I needed some code that would give me a list of the files in the banner images folder that are not referenced in my banner XML file. Iterating through XML can be clunky, but the introduction of LINQ has made it much easier. I fired up LINQPad (which, BTW, is an AWESOME free tool for testing out LINQ code) and tried out a few ideas. As a side note, it looks like IntelliSense is now available if you purchase a copy of LINQPad.

http://www.linqpad.net/

I started with querying the filesystem to get a list of files:

  

DirectoryInfo info = new DirectoryInfo(@"X:\inetpub\wwwroot\images\banner");
FileInfo[] files = info.GetFiles();

files.Dump();

 

.Dump() is an extension method available in LINQPad which dumps out the results of the query (we haven't actually used LINQ yet to do anything).

Here's what it looks like:

[Screenshot: LINQPad output of files.Dump()]

You might notice that Directory contains a DirectoryInfo element. If you click on the down arrow, it expands those values as well.

Now that I had a list of files, I wanted a list of the images referenced in my banner file. Here's the format of the XML file:

<News>
   <NewsItem>
       <Title>Supertooth3-banner.gif</Title>
       <Image>/images/banner/Supertooth3-banner.gif</Image>
       <Height>183</Height>
       <Link>/PortalView.aspx?navto=/Supertooth3-banner.gif</Link>
       <Date>July, 19th, 2003</Date>
       <Target></Target>
   </NewsItem>
   ...
</News>

  

 

I haven't really been using Title for anything besides the name of the image file, so I was able to take a bit of a shortcut here and use it for my comparison. To pull out the list of images used in the XML, I wrote this query:

XDocument doc = XDocument.Load(@"X:\inetpub\wwwroot\adv.xml");
doc.Dump();

var news = from item in doc.Descendants("Title")
           orderby item.Value
           select item.Value;

news.Dump();

 

Which produces this:

[Screenshot: LINQPad output of news.Dump()]
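The Title shortcut only works as long as Title mirrors the file name. A hedged alternative is to read the Image element instead and strip the folder part with Path.GetFileName. This is only a sketch, not what I actually ran: the BannerXml class and ReferencedImages method are names made up for illustration, and it assumes every Image value looks like the one in the sample XML above.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class BannerXml
{
    // Hypothetical helper: returns the file-name part of every <Image> path,
    // sorted by name. Empty <Image> elements are skipped.
    public static string[] ReferencedImages(XDocument doc)
    {
        return doc.Descendants("Image")
                  .Select(e => Path.GetFileName(e.Value.Trim()))
                  .Where(name => !string.IsNullOrEmpty(name))
                  .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
                  .ToArray();
    }
}
```

With the doc variable from above, BannerXml.ReferencedImages(doc) would stand in for the news query, provided the query type you are using allows class declarations.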

I noticed that the banner images folder contained a bunch of other files that I really didn't want to consider for filtering, so I needed to narrow my query to files that had specific extensions. C# doesn't really have a direct equivalent for INLIST, but we can do this a slightly different way for the same effect.

First I define an array of valid extensions, then (inside the where clause of the LINQ query) I check to see if this list of file types contains the filetype of the file I'm currently evaluating. It's a bit backwards, but it's simple and it works.

string[] fileTypes = { ".jpg", ".gif", ".png" };

var imgFiles = from file in files
               where (fileTypes.Contains(file.Extension))
               orderby file.Name
               select file.Name;

imgFiles.Dump();
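One thing to watch: Contains on a string array is case-sensitive, so a file named Banner.GIF would be skipped by the query above. If that matters for your folder, a minimal sketch of a case-insensitive version follows. The ImageFilter class and ImageNames method are hypothetical names, and the method works on plain file names so it can be tried without touching the disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ImageFilter
{
    // Extensions to keep; the comparer makes ".GIF" match ".gif".
    private static readonly HashSet<string> FileTypes =
        new HashSet<string>(new[] { ".jpg", ".gif", ".png" },
                            StringComparer.OrdinalIgnoreCase);

    // Hypothetical helper: returns the names whose extension is in FileTypes,
    // sorted by name without regard to case.
    public static List<string> ImageNames(IEnumerable<string> fileNames)
    {
        return fileNames
            .Where(name => FileTypes.Contains(Path.GetExtension(name)))
            .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

With the files array from earlier, the call would be ImageFilter.ImageNames(files.Select(f => f.Name)).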

 

Now I've got a list of files from my XML and a list of files from the banner images folder. I want to get a list of files from the banner images folder that aren't in the XML list. I do this via a final query:

var extra = from singleFile in imgFiles
            where !(news.Contains(singleFile))
            select singleFile;

extra.Dump();

This returns my extra files. Now I can just use this list to move my images into an archive folder periodically.
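I haven't shown the move step itself, so here is a rough sketch of what it could look like. Everything in it is an assumption: BannerArchiver and Archive are made-up names, the archive folder location is up to you, and skipping files that already exist in the archive is just one possible policy.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BannerArchiver
{
    // Hypothetical helper: moves each named file from bannerDir into archiveDir
    // and returns how many files were moved. A file is skipped when it is
    // missing from bannerDir or already present in archiveDir.
    public static int Archive(string bannerDir, string archiveDir,
                              IEnumerable<string> fileNames)
    {
        Directory.CreateDirectory(archiveDir); // does nothing if it already exists
        int moved = 0;
        foreach (string name in fileNames)
        {
            string source = Path.Combine(bannerDir, name);
            string target = Path.Combine(archiveDir, name);
            if (!File.Exists(source) || File.Exists(target))
                continue; // skip rather than overwrite or throw
            File.Move(source, target);
            moved++;
        }
        return moved;
    }
}
```

Passing extra.ToList() rather than the lazy query keeps the list fixed while files are being moved.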

If I put it all together, I end up with this:

 

XDocument doc = XDocument.Load(@"X:\inetpub\wwwroot\adv.xml");
DirectoryInfo info = new DirectoryInfo(@"X:\inetpub\wwwroot\images\banner");
FileInfo[] files = info.GetFiles();

doc.Dump();
files.Dump();

var news = from item in doc.Descendants("Title")
           orderby item.Value
           select item.Value;

news.Dump();

string[] fileTypes = { ".jpg", ".gif", ".png" };

var imgFiles = from file in files
               where (fileTypes.Contains(file.Extension))
               orderby file.Name
               select file.Name;

imgFiles.Dump();

var extra = from singleFile in imgFiles
            where !(news.Contains(singleFile))
            select singleFile;

extra.Dump();
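As a possible variation on the last query: because news is a lazy query, news.Contains(...) walks (and re-sorts) the titles once per file. That is fine for a small folder, but the same "in the folder, not in the XML" result can also be expressed with Enumerable.Except, which builds its lookup set once. A sketch only, with BannerDiff and Unreferenced as made-up names and a case-insensitive comparison as my assumption:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BannerDiff
{
    // Hypothetical helper: returns the names in imgFiles that do not appear
    // in referenced, ignoring case. Except also removes duplicate names.
    public static List<string> Unreferenced(IEnumerable<string> imgFiles,
                                            IEnumerable<string> referenced)
    {
        return imgFiles
            .Except(referenced, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

With the variables from the listing above, BannerDiff.Unreferenced(imgFiles, news) would replace the extra query.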

Links:

http://www.linqpad.net/


 
Wednesday, August 12, 2009 8:13:29 AM (Eastern Standard Time, UTC-05:00)
This is really cool stuff, thanks for posting this. I just started using LINQPad and bought the auto-completion license because I like it so much. Nice blog BTW, I'm adding it to my favorites.
Andrew