I ran across a really cool .NET library on a recent project I've been working on. We have an internal website where we post news, documentation, etc. - basically a Content Management System (CMS). We're working on a new set of documentation that is being written in a third-party help builder application. We need to import the HTML files it generates into our website (so we get everything the site offers, like security, searching, revision tracking, view statistics, etc.). So basically, I need to run through a lot of HTML files, build a tree of the documents (similar to the help file) and rewrite all of the URLs and image links to point to the correct URL inside of the site.

I initially started looking at various regular expressions that I might be able to use over at http://regexlib.com/. Almost every single one of them had some comment about it failing under some circumstances. The HTML is surprisingly clean, but I was still nervous about it. So I looked at using GOLD to parse the HTML. However, from some of the comments I found, it still didn't make everything as easy as I would have liked.

I finally ran across HtmlAgilityPack over on CodePlex. It's a .NET library which lets you read AND write changes to an HTML file via a simple API. Here's a chunk of code from my importer so you can get a feel for how it works:

    // Needs "using System;" and "using HtmlAgilityPack;" at the top of the file
    HtmlDocument doc = new HtmlDocument();
    doc.Load(content.FullDocumentPath);

    // Select every <a> element which has an href attribute
    HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
    Content match = null;

    // Run only if there are links in the document (SelectNodes returns null when nothing matches).
    if (linkNodes != null)
    {
        // Fix up the URLs
        foreach (HtmlNode linkNode in linkNodes)
        {
            HtmlAttribute attrib = linkNode.Attributes["href"];

            // If it's an internal page anchor, ignore it
            if (attrib.Value.StartsWith("#"))
                continue;

            string path = this.GetAbsolutePath(content.DocumentLink, attrib.Value);
            match = this.m_contentList.Find(p => p.DocumentLink == path);

            if (match != null)
                attrib.Value = match.GetUrl();
            else if (!path.ToLower().StartsWith("http://") && !path.ToLower().StartsWith("mailto:"))
                Console.WriteLine("Cannot find matching document, searched for " + path);
        }
    }

Basically, doc.DocumentNode.SelectNodes("//a[@href]") returns a collection of the link nodes in the document which have an href attribute (it uses XPath syntax for the selection string). From there, I just iterate through them, build the new URL, then save the modified URL via code that just does: linkNode.Attributes["href"].Value = "New URL Here".

I also needed to strip out all the script tags inside of the document, so it uses similar syntax:

    private void StripOutScripts(HtmlDocument doc)
    {
        // Strip out the scripts
        HtmlNodeCollection scriptNodes = doc.DocumentNode.SelectNodes("//script");

        if (scriptNodes != null)
        {
            foreach (HtmlNode scriptNode in scriptNodes)
            {
                // Removal goes through the parent; false = don't keep the script's child nodes
                scriptNode.ParentNode.RemoveChild(scriptNode, false);
            }
        }
    }

I do the same sort of thing - iterate over the collection, except this time I tell it to remove the nodes from the document. Note that the removal is done through the parent node: you ask the script node's parent to remove that child, and passing false as the second argument means the contents of the script go away along with the <script> tags themselves.

Each node has a WriteContentTo() method which can write out the HTML for that section of the document. What's really nice about this entire library (besides how simple it was to use) is the fact that it doesn't seem to mangle the existing HTML when using WriteContentTo() (at least from what I've seen).

Only one minor complaint - the docs are a bit weak. They just include the standard documentation of the classes, not much in the way of examples. However, the API is pretty consistent so it doesn't take much to get started with it.

What a great library - it couldn't be simpler. It saved me a ton of time.

Links:
http://www.codeplex.com/htmlagilitypack
http://regexlib.com/
http://www.devincook.com/goldparser/
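The GetAbsolutePath helper called in the link fix-up above isn't shown in the post. As a rough idea of what such a helper might do, here is a minimal sketch that resolves a relative href against the path of the document containing it, using System.Uri. The LinkPaths class name, the dummy import.invalid base address, and the assumption that DocumentLink is a site-relative path such as docs/guide/intro.html are all made up for illustration; the real importer may work quite differently.

```csharp
using System;

public static class LinkPaths
{
    // Placeholder host, used only to give System.Uri something to resolve against.
    private static readonly Uri DummyRoot = new Uri("http://import.invalid/");

    // Hypothetical stand-in for the importer's GetAbsolutePath helper.
    public static string GetAbsolutePath(string documentLink, string href)
    {
        Uri documentUri = new Uri(DummyRoot, documentLink);
        Uri resolved = new Uri(documentUri, href);

        // Anything that left the dummy site (http://, mailto:, ...) was already absolute.
        if (resolved.Scheme != DummyRoot.Scheme || resolved.Host != DummyRoot.Host)
            return href;

        // Drop the leading slash so the result looks like a DocumentLink again.
        return Uri.UnescapeDataString(resolved.AbsolutePath).TrimStart('/');
    }
}
```

Under these assumptions, GetAbsolutePath("docs/guide/intro.html", "../images/logo.png") resolves to docs/images/logo.png, while an http:// or mailto: link comes back unchanged.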
This was an interesting issue I ran into today. We've got a utility I wrote which can search Outlook MSG files from a desktop interface. As part of the program you can print the results to a PDF - I'm using Crystal Reports to do the printing. We had this application installed on a desktop machine for a while but decided to move it over to one of our servers which had a lot more hard drive space. After moving it, when we attempted to print we'd get the error:

"An error has occurred while attempting to load the Crystal Reports runtime"

At first I thought it might be related to needing the CR DLLs installed instead of just being deployed in the app's directory. We installed them and tried again - same exception. Further down in the error (which was actually helpful - imagine that!):

"Please install the appropriate Crystal Reports redistributable (snip) containing the correct version of the Crystal Reports runtime (x86, x64, or Itanium)."

Hmm...x86 vs x64 - that has to be it - this was Windows 2003 Server (64 bit) vs. XP. I actually just recently listened to a DotNetRocks podcast which talked about .NET applications running under a 64 bit OS. They mentioned that, by default, most .NET applications are compiled as "Any CPU". That means the code gets JIT'ed to 64 bit code under a 64 bit OS - sounds OK. The only catch is that all the components the application loads must match (a 64 bit process can't load 32 bit DLLs), otherwise you run into problems. I didn't really want to have two different sets of DLLs, so I went back into my application, changed it from "Any CPU" to x86 and recompiled. Order was restored to the universe.

Links:
http://www.dotnetrocks.com
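A quick way to confirm which way a build actually ended up running on a given machine is to check the pointer size at runtime. This is only a small sketch; the ProcessInfo class and DescribeBitness method are names made up for illustration.

```csharp
using System;

public static class ProcessInfo
{
    // IntPtr.Size is 4 in a 32 bit process and 8 in a 64 bit process.
    public static string DescribeBitness()
    {
        return IntPtr.Size == 8 ? "64 bit" : "32 bit";
    }

    public static void Main()
    {
        Console.WriteLine("Running as a " + DescribeBitness() + " process");
    }
}
```

An application built as "Any CPU" would typically report 64 bit on a 64 bit OS, while the same code built as x86 reports 32 bit everywhere.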
I mentioned Dependency Injection / Inversion of Control (DI/IoC) recently and I really didn't explain what it is, why you might want to use this particular pattern, and why on earth you'd need a framework for it. It's a fancy name for a fairly simple concept. Instead of creating objects inside of your classes, you let the calling code "inject" the necessary instances into your code. It's probably easiest to see this in some code. I'm going to show both C# and VFP code, since I don't want you to get the idea that this is a .NET-only type thing.

    public class SampleDependency
    {
        public string SayHello()
        {
            return "Hello";
        }
    }

    public class Sample
    {
        protected SampleDependency m_depend;

        // Write-only property: lets the caller inject the dependency after construction
        public SampleDependency Depend
        {
            set { this.m_depend = value; }
        }

        // Constructor injection: the caller supplies the dependency up front
        public Sample(SampleDependency depend)
        {
            m_depend = depend;
        }
    }

    Sample sample = new Sample(new SampleDependency());

VFP Version

    DEFINE CLASS SampleDependency AS Session
        FUNCTION SayHello()
            RETURN "Hello"
        ENDFUNC
    ENDDEFINE

    DEFINE CLASS Sample AS Session
        oDepend = NULL

        * The dependency is passed in when the object is created
        FUNCTION Init(toDepend)
            This.oDepend = toDepend
        ENDFUNC
    ENDDEFINE

    loSample = CREATEOBJECT("Sample", CREATEOBJECT("SampleDependency"))
Notice that in both cases, we are passing in the instance we want the class to use instead of letting the class create the instance itself. That's all DI/IoC is. Honest, that's it. This is DI via a constructor (you can also do it via a property setting instead; notice in the sample C# code I created a write-only property which could hold the reference).
So the next obvious question is, why? What's wrong with just creating the object inside of the class?
One thing DI gives you is the ability to easily swap in different objects. In the C# example, we probably would change the parameter from a specific type to an interface. Now any class which implements that interface can be injected into this class. In VFP, since it's not strongly typed, you can just pass in whatever instance you'd like (it's up to you to make sure it doesn't blow up at runtime by accessing some method or property which isn't on the passed-in object). My initial thought after seeing this was, well, can't I just use an abstract factory pattern instead? In an abstract factory you delegate object creation to an "object factory" - usually passing in a name or calling a method which returns the actual instance you'd like to use. This sounds like almost the same thing.
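To make the interface idea a bit more concrete, here is a minimal sketch of the C# sample reworked to depend on an interface instead of a specific class. The IGreeter interface, the EnglishGreeter and SpanishGreeter classes and the Greet method are hypothetical names invented for this example; they aren't part of the original sample.

```csharp
using System;

// Hypothetical interface extracted from SampleDependency.
public interface IGreeter
{
    string SayHello();
}

public class EnglishGreeter : IGreeter
{
    public string SayHello() { return "Hello"; }
}

public class SpanishGreeter : IGreeter
{
    public string SayHello() { return "Hola"; }
}

public class Sample
{
    private readonly IGreeter m_depend;

    // Any IGreeter implementation can be injected here.
    public Sample(IGreeter depend)
    {
        if (depend == null) throw new ArgumentNullException("depend");
        m_depend = depend;
    }

    public string Greet(string name)
    {
        return m_depend.SayHello() + ", " + name;
    }
}

public static class Program
{
    public static void Main()
    {
        // Same Sample class, two different injected implementations.
        Console.WriteLine(new Sample(new EnglishGreeter()).Greet("Brendan"));
        Console.WriteLine(new Sample(new SpanishGreeter()).Greet("Brendan"));
    }
}
```

Sample never changes here; only the object handed to its constructor does.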
An abstract factory does let you do that, but it doesn't let you easily do something the DI/IoC pattern does: test your objects. Let's suppose you want to write a test for a class which uses another class to send out an e-mail. You aren't really trying to test sending an e-mail - that's just one of the things the class you're testing happens to do during some process. In fact, you really don't want to send out an e-mail; we don't want to spam our users. If you happen to use the abstract factory pattern, you would need to modify it to create your dummy/stub/mock object for sending an e-mail (in VFP that's most likely by editing a record in a table, but the idea is the same), then test the object in question. If you used the DI/IoC pattern the only thing you need to do is pass in your dummy/stub/mock object. No other modifications are necessary.
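The e-mail scenario above might look something like the following sketch. Everything in it (IEmailSender, StubEmailSender, OrderProcessor and its Process method) is a made-up example rather than code from a real project, and the "test" is just a Main method rather than a proper unit testing framework.

```csharp
using System;
using System.Collections.Generic;

public interface IEmailSender
{
    void Send(string to, string subject);
}

// Hand-rolled stub: records the messages instead of sending anything.
public class StubEmailSender : IEmailSender
{
    public readonly List<string> Sent = new List<string>();

    public void Send(string to, string subject)
    {
        Sent.Add(to + "|" + subject);
    }
}

// The class under test; sending an e-mail is just a side effect of its real job.
public class OrderProcessor
{
    private readonly IEmailSender m_sender;

    public OrderProcessor(IEmailSender sender)
    {
        m_sender = sender;
    }

    public bool Process(string customerEmail, int quantity)
    {
        if (quantity <= 0)
            return false;

        m_sender.Send(customerEmail, "Order received");
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        StubEmailSender stub = new StubEmailSender();
        OrderProcessor processor = new OrderProcessor(stub);

        bool accepted = processor.Process("user@example.com", 2);

        // No mail server involved; we only inspect what the stub recorded.
        Console.WriteLine("Accepted: " + accepted + ", messages recorded: " + stub.Sent.Count);
    }
}
```

The production code would pass in a real sender; the test passes in the stub, and nothing else has to be modified.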
OK, so this all looks simple enough. Why would you need a framework for the above?! One of the biggest reasons - less typing. You'll notice that in order to use any of these objects you may have to pass in a bunch of other dependency objects (which themselves may have other dependencies). For a complex set of objects that could really suck. A DI framework does that for you along with the benefits of an abstract factory, all rolled up into one. In your code you call the DI framework and tell it to get you an instance of a class - it figures out what objects need to be passed in for you so you don't need to do it. In your tests, you can instantiate the objects directly and pass in your stub/dummy/mock objects instead.
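To give a feel for what "it figures out what objects need to be passed in" means, here is a toy container that resolves constructor parameters recursively using reflection. It is only a sketch of the general idea, not how any real framework is implemented: TinyContainer, its Register and Resolve methods and the sample classes are all invented for this example, and it assumes each class has a single public constructor.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Toy container: maps a requested type to the concrete type to build.
public class TinyContainer
{
    private readonly Dictionary<Type, Type> m_map = new Dictionary<Type, Type>();

    public void Register<TService, TImpl>() where TImpl : TService
    {
        m_map[typeof(TService)] = typeof(TImpl);
    }

    public T Resolve<T>()
    {
        return (T)Resolve(typeof(T));
    }

    private object Resolve(Type requested)
    {
        Type concrete;
        if (!m_map.TryGetValue(requested, out concrete))
            concrete = requested;   // unregistered classes are built as-is

        if (concrete.IsInterface || concrete.IsAbstract)
            throw new InvalidOperationException("Nothing registered for " + requested.Name);

        ConstructorInfo[] ctors = concrete.GetConstructors();
        if (ctors.Length == 0)
            throw new InvalidOperationException(concrete.Name + " has no public constructor");

        // Assumption: use the first public constructor and resolve each parameter recursively.
        ParameterInfo[] parameters = ctors[0].GetParameters();
        object[] args = new object[parameters.Length];
        for (int i = 0; i < parameters.Length; i++)
            args[i] = Resolve(parameters[i].ParameterType);

        return ctors[0].Invoke(args);
    }
}

public interface IEmailSender { string Describe(); }

public class SmtpEmailSender : IEmailSender
{
    public string Describe() { return "smtp"; }
}

public class Notifier
{
    private readonly IEmailSender m_sender;
    public Notifier(IEmailSender sender) { m_sender = sender; }
    public string Describe() { return "notifier -> " + m_sender.Describe(); }
}

public class ReportService
{
    private readonly Notifier m_notifier;
    public ReportService(Notifier notifier) { m_notifier = notifier; }
    public string Describe() { return "report -> " + m_notifier.Describe(); }
}

public static class Program
{
    public static void Main()
    {
        TinyContainer container = new TinyContainer();
        container.Register<IEmailSender, SmtpEmailSender>();

        // One call builds ReportService, its Notifier, and the Notifier's IEmailSender.
        ReportService service = container.Resolve<ReportService>();
        Console.WriteLine(service.Describe());
    }
}
```

The calling code only asks for a ReportService; the chain of dependencies behind it is wired up by the container. Real frameworks add a lot on top of this (lifetimes, configuration, choosing between constructors), which this sketch ignores.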
Links:
http://weblogs.asp.net/rosherove/archive/2007/09/16/mocks-and-stubs-the-difference-is-in-the-flow-of-information.aspx
http://www.hanselman.com/blog/ListOfNETDependencyInjectionContainersIOC.aspx
I attended a Day of Dot Net event in Lansing a few weeks back. If you're not familiar with them, they are free one-day mini-conferences about, not surprisingly, .NET. I had originally planned on driving out to it Saturday morning, but then Jenn pointed out I'd have to get up really early to get there around 8am. I already work in Farmington, which is 45 minutes to 1 hour from home (and 45 minutes to 1 hour closer to Lansing), so driving out from home would have made for a really long day. So I ended up just staying at a hotel in Lansing Friday night. That turned out to be a great idea. Note to self: the Best Western in Lansing feels and smells like an '80s-style bowling alley.

When I got to the hotel, I had some time to look up directions to the college where it was being held. It was only a few miles away, so I decided not to bother driving over there on Friday night (which is what I would normally have done). In the morning I followed the directions Google Maps had given and found myself in a church parking lot (hmm..."Day of Dot Net and Evangelical Revival??"). I checked the map a few times and it looked OK, and I was exactly where it said I should be. I think that was about the point where I started cursing out Google Maps. I had left all my information about the conference in the trunk so I had to get out of the car to get at it. I happened to notice that the road I was on looked like it actually continued around the side of the church (imagine a light bulb going off above my head: "hey...maybe...."). I jumped back in the car and drove around the parking lot and sure enough, the trees suddenly cleared on my left-hand side where the college was hiding. I noticed another car stop right about where I stopped, so it wasn't just me being dense (honest!). Note to organizers - great event, but a small sign would have been appreciated.
When I got into the building (which also wasn't marked, so I still wasn't entirely sure I was at the right entrance), I was surprised at how few people were there, since I wasn't able to make it to the last DoDN because it had filled up. It turns out that as the morning wore on the sessions really started to fill up. The sessions were an hour long - which is REALLY short; they flew by. The sessions all seemed to run a few minutes long, which pushed each one into the next. The lunch break helped to reset everything.

A few notes to the various presenters:
- The bottom 1/3 of the screen really isn't visible if you're sitting in the back of the room. I was sitting in the second row and couldn't read some of it.
- White text on a black background might be easier on your eyes for development, but it's impossible to read when it's projected up on a screen. I'd suggest sticking with black text on a white background.
- Don't try to wing demos. Only a few people can successfully pull that off - you're probably not one of them.
I actually ran into a few people I knew - one was someone from the local VFP user group, the other was an old VFP developer that I haven't seen in a few years. That was a nice surprise. Overall, I was impressed by the number of people who attended, considering you're basically giving up a weekend day to attend. It's nice to see that some people actually care about getting better as developers (either that or they, like me, needed a few new shirts for their wardrobe).

One session happened to stand out in my mind - a session about Dependency Injection / Inversion of Control (specifically, the Windsor framework) by Jay Wren. It was well organized and hit every question I had about DI/IoC. Honestly, I didn't "get" DI/IoC before this session; yeah, I understood what it was, but not really why on earth I might need a framework for it. It is actually an elegant way of solving a particular development problem, giving you the benefits of a factory pattern and the flexibility of DI, without getting in your way (at least that's what my notes say). I had the "a ha!" moment, then promptly lost it in one of the other sessions. I'm sure it will come to me at some point, although at this point I'm getting a bit nervous.

At the end of the conference, they ended up giving away a ton of stuff. Just not to me. Oh well, maybe next time. Overall, I'd definitely attend another one - a big thanks to the organizers, presenters and sponsors. I know it's a lot of work to put something like this on - it was appreciated. Just try to make sure you've got some soft drinks (Coke, Mt. Dew, etc.) available in the morning next time around <g>. It's hard for some of us to get moving in the morning without some caffeine (for us non-coffee drinkers). Sure, I feel all healthy from the orange juice I ended up drinking, but it didn't help much to put a spring in my step.

Links:
http://www.dayofdotnet.org/Lansing/2008/
http://jrwren.wrenfam.com/blog/
I've been spending a lot of time lately doing some winter cleaning. We're trying to free up some space in the basement for a play area for Brendan. It's amazing how much stuff you can collect. We've thrown away a LOT of stuff - I'm pretty sure the garbage guys hate us by now. Jenn mentioned that one of them didn't look too pleased when he tried to lift one of the bags we put out. We've been donating anything we think might still be useful, and we have people who drive through our subdivision on garbage days looking for interesting finds. More power to them, I say; I'd rather someone find some use for this stuff instead of throwing it out. Besides, who's got the patience for a garage sale? And who really wants to deal with people trying to get half price for an item marked $1 that originally cost $50?

Since I've been involved with computers for quite some time (and not all of it as a developer), I've managed to build up quite a collection of old computers. Old Pentiums, 486s, a few 386s, motherboards, cases, power supplies, an unbelievable amount of cables, network cards, video cards, etc. I'm planning on posting that stuff on our local freecycle site to see if anyone might be interested in it before tossing it. One of my regrets with a lot of this is that I didn't give it away sooner, while it still may have been of more use to someone. I guess that may have been why I kept it.

A big part of this collection is a ton of books and magazines. I've whittled the magazines down to something manageable, but I still have way too many books. I'm sure I'll add more to the list as soon as I can convince myself that I really don't need them anymore, and once I have time to go through the ones still hiding in the basement (and hopefully before some of them aren't useful anymore). Here's a list of what's on the chopping block (you might be surprised; there are some good books here):

- Apple II Plus/IIe Troubleshooting & Repair Guide, Robert C. Brenner. Sams. ISBN: 0-672-22353-8
- DNS and BIND 3rd Edition, Paul Albitz & Cricket Liu. O'Reilly. ISBN: 1-56592-512-2
- XML Extensible Markup Language (w/CD), Elliotte Rusty Harold. IDG Books. ISBN: 0-7645-3199-9
- The Unified Modeling Language User Guide, Booch, Rumbaugh, Jacobson. Addison-Wesley. ISBN: 0-201-57168-4
- The Visual FoxPro 3 Codebook (CD is missing), Yair Alan Griver. Sybex. ISBN: 0-7821-1648-5
- Object Orientation in Visual FoxPro, Savannah Brentnall. Addison-Wesley. ISBN: 0-201-47943-5
- Object Models: Strategies, Patterns, & Applications (Second Edition), Coad, North, Mayfield. Yourdon Press. ISBN: 0-13-840117-9
- Visual Basic 6 Business Objects, Rockford Lhotka. Wrox. ISBN: 1-861001-07-X
- ASP.NET 2.0 Unleashed, Stephen Walther. Sams. ISBN: 0-672-32823-2
- Hacker's Guide to Visual FoxPro 6.0, Granor, Roche. Hentzenwerke Publishing. ISBN: 0-96550-936-2
- The Inmates Are Running The Asylum, Alan Cooper. Sams. ISBN: 0-672-31649-8
- About Face: The Essentials of User Interface Design, Alan Cooper. IDG Books. ISBN: 1-56884-322-4
- The Improvement Guide, Langley, Nolan, Nolan, Norman, Provost. Jossey-Bass. ISBN: 0-7879-0257-8
- HTML: The Complete Reference (Second Edition), Thomas A. Powell. Osborne. ISBN: 0-07-211977-2
- Effective Techniques for Application Development w/VFP 6.0, Booth, Sawyer. Hentzenwerke. ISBN: 0-96550-937-0
- What's New in Visual FoxPro 8.0, Granor, Hennig. Hentzenwerke. ISBN: 1-930919-40-9
- CrysDev: A Developer's Guide to Integrating Crystal Reports, Craig Berntson. Hentzenwerke. ISBN: 1-930919-38-7
- Advanced Object Oriented Programming w/VFP 6, Egger. Hentzenwerke. ISBN: 0-96550-938-9
- Client/Server Applications w/VFP & SQL Server, Urwiler, DeWitt, Ley, Koorhan. Hentzenwerke. ISBN: 1-930919-01-8
- C# Unleashed, Joseph Mayo. Sams. ISBN: 0-672-321-22-X
- Measuring and Managing Performance in Organizations, Robert D. Austin. Dorset House. ISBN: 0-932633-36-6
- Windows 2000 Server Resource Kit (No CD), 8 books total
If anyone might be interested in this stuff (books or computer techno-rubble), drop me a line (I can take a pic of the computer stuff). All of it is free as long as you pick up the shipping cost. I hope I don't regret giving away some of this; these books have served me well <g>

Links:
http://www.freecycle.org
There is a ton of great .NET content available on the web; everything from simple code snippets to full-blown apps. I really appreciate that people put the time into this stuff and make it available. But I have one request: would it kill you to include the namespace references in your sample code? There are thousands of classes in .NET - I hate having to try and figure out where these classes are hiding in order to get my code to compile (esp. since I'm normally looking at this code because I'm not familiar with the class or classes required to do whatever it is I'm trying to accomplish).

Having said that, I never realized VS would actually help resolve these references for me. If you right-click on a type (in this case, I right-clicked on File), there is a Resolve Namespace option on the context menu.
Very nice!
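For the File example, the class lives in the System.IO namespace, so resolving it amounts to adding a single using directive at the top of the file. Here is a small sketch of what the compiling result looks like; the FileDemo class and RoundTrip method are made-up names used only for illustration.

```csharp
using System;
using System.IO;   // the directive needed for File (and Path)

public static class FileDemo
{
    // Writes the text to a temporary file, reads it back, then cleans up.
    public static string RoundTrip(string text)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text);
            return File.ReadAllText(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("Hello"));
    }
}
```

Without the using System.IO; line, the references to File and Path don't compile unless they are written out in full as System.IO.File and System.IO.Path.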
A while back I needed a routine which would display the first few hundred characters of a longer chunk of text. Obviously, it's easy enough to do that with the Substring method of a string. However, I wanted to do this on a word boundary (I didn't want to end up with half of a word being displayed). So I wrote a simple routine which broke up the string into an array (using Split()), then rebuilt the string (keeping track of the length along the way). Maybe 25 lines of code or so. It seemed to work OK, so I was good to go.

The other day I realized I needed to adjust the code to strip out HTML markup before displaying the text - you can imagine how "nice" that might look if I happened to chop off an ending tag somewhere. I knew I had a third party library to do this; there are a lot of really useful little routines hiding in the West Wind Web Store .NET 2.0. So I took a look through the library and found what I was looking for. While looking for it, I noticed another routine which appeared to do exactly the same thing I wanted. Except it was like 5 lines and much easier to understand. Doh! 30 seconds later I rewrote my routine. Here it is...

    public static string TruncateString(string source, int maxLength, string ending)
    {
        // Do we even have to truncate it?
        if (source == null || source.Length <= maxLength)
            return source;

        string text = source.Substring(0, maxLength);

        // Back up to the last word boundary (if there is one) so we don't cut a word in half
        int lastSpace = text.LastIndexOf(' ');
        if (lastSpace > 0)
            text = text.Substring(0, lastSpace);

        // Tack on the ending (e.g. "...") to show the text was cut off
        return text + ending;
    }