FooReader.NET

Syndication with RSS and Atom is about the sharing of information. In order to view information from several different sources, an application called an aggregator is used to assimilate the different feeds into one location. An aggregator makes it easier and faster to stay up-to-date with information collected from around the Web (much easier than visiting several web sites each day). The rest of this chapter focuses on the design and development of such an aggregator.

FooReader.NET is a web-based, .NET RSS/Atom aggregator ported from ForgetFoo's ColdFusion-based FooReader (http://reader.forgetfoo.com/). With many conventional applications filling the aggregator void, including popular e-mail applications, why build a web-based RSS/Atom aggregator? Consider the following reasons:

The Web is cross-platform. Building a web-based aggregator ensures that anyone with a modern browser can access it.
The Web is centrally located. One of the problems with conventional aggregators that are installed on the computer is the upkeep of data in many locations. If you like to read syndicated feeds at work and at home, you must install an aggregator on each computer and load it with the appropriate feeds. A web-based aggregator eliminates this problem because any change made to the feed list is seen regardless of the user's location.

The next sections explain how FooReader.NET is built using Ajax. As with any web application, there are two main components: client-side and server-side.

Client-Side Components

As described earlier, the client-side components of an Ajax solution are in charge of presenting the data to the user and communicating with the server. For FooReader.NET, several client-side components are necessary to manage the overall user experience.

XParser, the JavaScript class responsible for requesting information and parsing it when the data is received.
The user interface ties the user to their data. Because the user interface is essentially a web page, the usual suspects of web browser technologies are used: HTML, CSS, and JavaScript.
The JavaScript code that takes the information XParser received and displays it in the UI.

Note

Although JavaScript has no formal definition of classes, it does have the logical equivalent. To aid in your understanding, this text refers to functions that create objects as classes.

XParser

The first component in FooReader.NET is XParser, a JavaScript class that parses RSS and Atom feeds into JavaScript objects that can be easily used in web applications. The primary goal of XParser is to provide an interface for developers to easily access the most important elements. Not only does this save lines of code (not to mention download time), it also spares the client extra, unnecessary work.

Its design is inspired primarily by the class-centric .NET Framework, with three classes supporting the main XParser class: XParserItem, XParserElement, and XParserAttribute.

Starting with the simplest class in XParser, XParserAttribute is a representation of an element's attribute. Attributes tend to have only one desired piece of data, their value, and that is the only property of XParserAttribute.

function XParserAttribute(oNode) {
    this.value = oNode.nodeValue;
}

The XParserAttribute constructor takes one argument, the DOM attribute node. From this node, the attribute's value is accessed and stored to the value property by the node's nodeValue property. Simple, eh?

The XParserElement class represents an XML element and is responsible for accessing and retrieving the element's value and attributes. The class's constructor accepts two arguments: the XML element's node and the value of that node.

function XParserElement(oNode,sValue) {
    this.node = oNode || false;
    this.value = sValue || (this.node && this.node.text) || false;

In these first few lines, two of the four class properties (node and value) are set according to the values of the parameters. Notice the use of the OR (||) operator in the assignment.

In the assignment of node, using the OR operator is a shorthand version of the ternary operator and produces the same results as this.node = (oNode)?oNode:false;. The assignment of value works more like an if…else if block.

if (sValue) {
    this.value = sValue;
}else if (this.node && this.node.text) {
    this.value = this.node.text;
} else {
    this.value = false;
}

The shorthand version removes over half of the characters of the if...else if block, which reduces the amount of script the browser has to download.
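One caveat of the shorthand is worth a quick sketch: the OR operator falls through on any falsy value, so an empty string passed as the value is treated the same as a missing argument. The helper name pickValue and the plain object standing in for a DOM node are hypothetical; they only mirror the assignment shown above.

```javascript
// Hypothetical helper mirroring the value assignment in XParserElement.
function pickValue(oNode, sValue) {
    var node = oNode || false;
    return sValue || (node && node.text) || false;
}

// A plain object stands in for a DOM node that has a text property.
var oFakeNode = { text: "from node" };

var a = pickValue(oFakeNode, "explicit"); // the explicit value wins
var b = pickValue(oFakeNode);             // falls back to node.text
var c = pickValue(null);                  // nothing available, so false
var d = pickValue(oFakeNode, "");         // "" is falsy, so node.text is used

console.log(a, b, c, d);
```

If an empty string ever needs to be preserved as a legitimate value, the longer if...else if form with an explicit comparison would be the safer choice.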

Note

There is no text property for an element in the Firefox DOM. To gain this functionality, XParser uses the zXml library introduced in previous chapters, which extends the Firefox DOM to include the text and xml properties.

The next few lines build the attributes collection, a collection of XParserAttribute objects.

if (this.node) {
    this.attributes = [];
    var oAtts = this.node.attributes;
    for (var i = 0; i < oAtts.length; i++) {
        this.attributes[i] = new XParserAttribute(oAtts[i]);
        this.attributes[oAtts[i].nodeName] = new
            XParserAttribute(oAtts[i]);
    }
} else {
    this.attributes = false;
}
this.isNull = (!this.node && !this.value && !this.attributes);

The existence of the node is checked first. (There's no sense in creating a collection of attributes of a node that doesn't exist.) Then an array is created, and the element's attributes are collected through the DOM property attributes. Using a for loop, the attribute node is used in the creation of XParserAttribute objects; one using an integer key, and one using the attribute's name as a key that is retrieved with the nodeName property. The latter is to allow easy access to specific attributes where the name of the attribute is known.

If the element's node does not exist, attributes is set to false. (You can't divide by 0, and you can't get attributes from a node that doesn't exist.) Last, isNull is assigned. The isNull property enables you to check if the resulting object is null or not. If the node, value, and attributes properties are all false, then the XParserElement object is considered to be null.
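The pieces of XParserElement can be assembled and exercised outside the browser, as in the following sketch. It makes two assumptions: the node is a plain mock object rather than a real DOM node, and a single XParserAttribute object is stored under both the integer key and the name key (the listing above creates two separate objects).

```javascript
function XParserAttribute(oNode) {
    this.value = oNode.nodeValue;
}

function XParserElement(oNode, sValue) {
    this.node = oNode || false;
    this.value = sValue || (this.node && this.node.text) || false;
    if (this.node) {
        this.attributes = [];
        var oAtts = this.node.attributes;
        for (var i = 0; i < oAtts.length; i++) {
            var oAtt = new XParserAttribute(oAtts[i]);
            this.attributes[i] = oAtt;                 // access by position
            this.attributes[oAtts[i].nodeName] = oAtt; // access by name
        }
    } else {
        this.attributes = false;
    }
    this.isNull = (!this.node && !this.value && !this.attributes);
}

// Mock of <link href="http://example.com/" rel="alternate"/> (hypothetical data).
var oMockNode = {
    text: "",
    attributes: [
        { nodeName: "href", nodeValue: "http://example.com/" },
        { nodeName: "rel", nodeValue: "alternate" }
    ]
};

var oLink = new XParserElement(oMockNode); // built from a node
var oEmpty = new XParserElement();         // built from nothing: a null element

console.log(oLink.attributes.href.value, oLink.attributes.length);
console.log(oLink.isNull, oEmpty.isNull);
```

Because the named keys are ordinary properties on the array, they do not count toward its length, so looping by index still visits each attribute exactly once.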

The XParserItem class represents an <rss:item/> or an <atom:entry/> element (from now on, these elements will be referred to as simply items). Items consist of several elements, so naturally there will be quite a bit of parsing. The XParserItem constructor takes one argument called itemNode, which represents the item's DOM node.

function XParserItem(itemNode) {
    this.title=this.link=this.author=this.description=this.date =
        new XParserElement();

This first line of this class is important. Even though RSS and Atom are standards, it does not necessarily mean that everyone follows the given standards. An RSS feed may leave out the <author/> element in one or all items, or an Atom feed may disregard the <content/> tag in favor of the <summary/> tag. Because of these discrepancies, it is important that the XParserItem properties have a default value in order to avoid errors down the road. This default value is a null XParserElement. Now it is time to start parsing the child elements.

for (var i = 0; i < itemNode.childNodes.length; i++) {
    var oNode = itemNode.childNodes[i];
    if (oNode.nodeType == 1) {
        switch (oNode.tagName.toLowerCase()) {

To begin, a for loop cycles through the child elements. The node's type is then checked; if nodeType is 1, then the current node is an element. Checking a node's type may seem like an odd thing to do, but it is necessary. Mozilla can count white space between elements as children; therefore, checking the nodeType avoids errors when the node is sent to XParserElement. When it is confirmed that the current node is indeed an element, the tag name is used in a switch…case block and the item's properties are assigned according to their tag name counterparts.

//Shared Tags
case "title":
    this.title = new XParserElement(oNode);
break;
case "link":
    if (oNode.getAttribute("href")) {
        this.link = new XParserElement(oNode,oNode.getAttribute("href"));
    } else {
        this.link = new XParserElement(oNode);
    }
break;
case "author":
    this.author = new XParserElement(oNode);
break;

Although different in many ways, RSS and Atom do have a few elements in common. In this code the shared elements are the <title/>, <link/>, and <author/> elements. The only difference is in the <link/> element; in an Atom feed the desired value is located in the href attribute. The RSS value is simply the element's value.

//RSS Tags
case "description":
    this.description = new XParserElement(oNode);
break;
case "pubdate":
    this.date = new XParserElement(oNode);
break;

Following the shared elements are the RSS-specific elements, where the <description/> and <pubDate/> elements are matched and sent to XParserElement. (The case label is "pubdate" because the tag name is converted to lowercase before the switch.)

//Atom Tags
case "content":
    this.description = new XParserElement(oNode);
break;
case "issued":
    this.date = new XParserElement(oNode);
break;

In this code, the Atom-specific elements <content/> and <issued/> are matched. The last elements checked are extensions:

//Extensions
case "dc:date":
    this.date = new XParserElement(oNode);
break;
default:
break;

This code checks for the <dc:date/> element, a part of the Dublin Core extension (http://dublincore.org/documents/dcmi-terms/). This extension is widely used, so it is a good idea to check for it and use its value.

If for some reason a feed does not have an element in the switch block, the property representing the element defaults to a blank XParserElement object (from the first line of the XParserItem class). Because XParser is a JavaScript class, not every element needs to be parsed. However, the use of switch easily allows for the addition of other elements, if needed.
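The two safeguards in XParserItem, the nodeType check and the default null element, can be seen together in a reduced sketch. MiniElement and MiniItem are hypothetical stand-ins for XParserElement and XParserItem that handle only two tags, and the item node is a plain mock object rather than a real DOM node.

```javascript
// Reduced stand-in for XParserElement: only value and isNull are modeled.
function MiniElement(oNode) {
    this.node = oNode || false;
    this.value = (this.node && this.node.text) || false;
    this.isNull = (!this.node && !this.value);
}

// Reduced stand-in for XParserItem that handles two tags.
function MiniItem(itemNode) {
    // Chained assignment: both properties start as the same null element.
    this.title = this.author = new MiniElement();
    for (var i = 0; i < itemNode.childNodes.length; i++) {
        var oNode = itemNode.childNodes[i];
        if (oNode.nodeType == 1) { // 1 = element; 3 = text, such as white space
            switch (oNode.tagName.toLowerCase()) {
                case "title":
                    this.title = new MiniElement(oNode);
                    break;
                case "author":
                    this.author = new MiniElement(oNode);
                    break;
                default:
                    break;
            }
        }
    }
}

// Mock <item> with white-space text nodes around a single <title>.
var oMockItem = {
    childNodes: [
        { nodeType: 3, nodeValue: "\n  " },
        { nodeType: 1, tagName: "TITLE", text: "Hello feeds" },
        { nodeType: 3, nodeValue: "\n" }
    ]
};

var oItem = new MiniItem(oMockItem);
console.log(oItem.title.value);   // taken from the <title> element
console.log(oItem.author.isNull); // no <author> in the feed: default remains
```

The text nodes in the mock have no tagName, so without the nodeType check the call to toLowerCase() would throw. Note also that the chained assignment gives every unparsed property the same default object, so it is safer to replace a default than to modify it.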

The XParser class is the main class that encompasses all the previous classes discussed. Its constructor accepts an argument sFileName and an optional argument called bIsXml. For maximum flexibility, XParser is designed either to make its own XMLHttp requests or to have the responseText of an external request passed to the constructor. If bIsXml is false or undefined, then sFileName is treated as a URL and XParser will make its own request; otherwise, sFileName is treated as an XML string and will be loaded into an XML DOM object.

function XParser(sFileName, bIsXml) {
    var oThis = this;
    this.title = this.link = this.description = this.copyright = this.generator =
        this.modified = this.author = new XParserElement();
    this.onload = null;

It may seem strange to assign a variable to contain the object's reference, but this technique comes in handy when dealing with the onreadystatechange() event handler of an XMLHttp object, which will be seen later. The next line of code is similar in idea and function to the first line in XParserItem. It sets the object's properties that represent important elements with a default value. The following line declares onload, an event handler, which fires when the feed is completely loaded.

XParser's load() method is called when the XML data is retrieved. When XML is passed to the constructor, the load() method is immediately called and the XML data contained in sFileName is passed to the method.

if (bIsXml) {
    this.load(sFileName);
}

However, when a URL is passed to the constructor (and bIsXml is false or undefined), XParser makes its own request.

else {
    var oReq = zXmlHttp.createRequest();
    oReq.onreadystatechange = function () {
        if (oReq.readyState == 4) {
            //only if "OK"
            if (oReq.status == 200) {
                oThis.load(oReq.responseText);
            }
        }
    };
    oReq.open("GET", sFileName, true);
    oReq.send(null);
}

Note that this code uses the zXmlHttp cross-browser factory introduced earlier in the book. Using the onreadystatechange() event handler, the status of the request is checked and the responseText is passed to the load() method when the request is successful. This is where the oThis variable comes into play. Had this.load(oReq.responseText) been used, an error would have been thrown stating this.load() is not a function. This particular error is thrown because, inside the onreadystatechange event handler, the this keyword references the event handler, not the XParser object.

Finally, the request is sent via the send() method.
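The role of oThis can be demonstrated without a browser. In this sketch, FakeRequest and Loader are hypothetical stand-ins for the XMLHttp object and XParser, and the fake request invokes its handler synchronously as one of its own methods. What this refers to in a real browser depends on how the XMLHttp implementation calls the handler, but in either case it is not the object that created the request.

```javascript
// Hypothetical stand-in for an XMLHttp object: it stores a handler and
// invokes it as one of its own methods when "sent".
function FakeRequest() {
    this.onreadystatechange = null;
}
FakeRequest.prototype.send = function () {
    this.responseText = "<rss version='2.0'/>";
    this.onreadystatechange(); // inside the handler, `this` is the request
};

// Hypothetical stand-in for XParser.
function Loader() {
    var oThis = this; // capture the Loader instance for the closure
    this.loaded = null;
    this.thisWasLoader = null;

    var oReq = new FakeRequest();
    oReq.onreadystatechange = function () {
        oThis.thisWasLoader = (this === oThis); // false: `this` is oReq here
        oThis.load(oReq.responseText);          // works through the closure
    };
    oReq.send();
}
Loader.prototype.load = function (sXml) {
    this.loaded = sXml;
};

var oLoader = new Loader();
console.log(oLoader.thisWasLoader, oLoader.loaded);
```

Calling this.load() inside the handler would look for load() on the request object in this sketch, which is why the captured reference is needed.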

The load() method is called only when XML data is ready to parse. It takes one argument, sXml, which is the XML data to parse.

XParser.prototype.load = function (sXml) {
    var oXmlDom = zXmlDom.createDocument();
    oXmlDom.loadXML(sXml);

In this code, the XML data is loaded into an XML DOM object created from the cross-browser XML DOM factory discussed in Chapter 4. Now that the DOM is ready to go, the parsing can begin.

this.root = oXmlDom.documentElement;

The first step is to parse the simple properties of the file. Using the root property, which references the XML document element, it's possible to determine what type of feed is being parsed.

this.isRss = (this.root.tagName.toLowerCase() == "rss");
if (this.isRss && parseInt(this.root.getAttribute("version"), 10) < 2) {
    throw new Error("RSS version is less than 2");
}
this.isAtom = (this.root.tagName.toLowerCase() == "feed");
this.type = (this.isRss)?"RSS":"Atom";

To find out what type of feed it is, the document's root element is checked. If the tag name is rss, then a version of RSS is being used. If the document's root element is feed, then it is an Atom feed. The Boolean properties isRss and isAtom are assigned their values according to the feed type. Last, the type property is set to reflect the feed type. This particular property is displayed to the user.

Note

If the feed is RSS, it is important to check the version of the feed. XParser was written to parse RSS 2.x feeds, and if the version is less than 2, an error is thrown and all parsing is stopped.
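The detection logic can be tried in isolation with a mock document element. The names detectFeedType and mockRoot are hypothetical; the checks themselves follow the listing above, with an explicit radix added to parseInt().

```javascript
// Hypothetical helper; oRoot only needs tagName and getAttribute().
function detectFeedType(oRoot) {
    var sTag = oRoot.tagName.toLowerCase();
    var bIsRss = (sTag == "rss");
    if (bIsRss && parseInt(oRoot.getAttribute("version"), 10) < 2) {
        throw new Error("RSS version is less than 2");
    }
    var bIsAtom = (sTag == "feed");
    return { isRss: bIsRss, isAtom: bIsAtom, type: bIsRss ? "RSS" : "Atom" };
}

// Minimal mock of a document element.
function mockRoot(sTagName, oAttributes) {
    return {
        tagName: sTagName,
        getAttribute: function (sName) {
            return oAttributes.hasOwnProperty(sName) ? oAttributes[sName] : null;
        }
    };
}

var oRss2 = detectFeedType(mockRoot("rss", { version: "2.0" }));
var oAtom = detectFeedType(mockRoot("feed", {}));

var sOldRssError = null;
try {
    detectFeedType(mockRoot("rss", { version: "0.91" })); // parseInt gives 0
} catch (oError) {
    sOldRssError = oError.message;
}

console.log(oRss2.type, oAtom.type, sOldRssError);
```

One behavior to be aware of: if the version attribute is missing entirely, parseInt() returns NaN, the comparison is false, and no error is thrown.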

Both RSS and Atom have elements that contain information about the feed itself, but they are located in different parts of the document. In RSS, this information is contained in a <channel/> element that encloses all other elements in the feed. Atom, on the other hand, uses its root element to enclose the information. Because both formats keep this information in a single enclosing element, the feed can be parsed from a common starting point:

var oChannel = (this.isRss)?this.root.getElementsByTagName("channel")[0]:this.root;

If the feed is RSS, the oChannel variable is set to the <channel/> element; otherwise, it's set to the document element of the feed. After this common starting point, parsing the non-items can begin:

for (var i = 0; i < oChannel.childNodes.length; i++) {
    var oNode = oChannel.childNodes[i];
    if (oNode.nodeType == 1) {

A for loop is used to loop through the children of the channel. Once again, the nodeType property of the current node is checked. If confirmed to be an element, parsing continues. Just like XParserItem, a switch block is used.

switch (oNode.tagName.toLowerCase()) {
    //Shared Tags
    case "title":
        this.title = new XParserElement(oNode);
    break;
    case "link":
        if (this.isAtom) {
            if (oNode.getAttribute("href")) {
                this.link = new XParserElement(oNode,oNode.getAttribute("href"));
            }
        } else {
           this.link = new XParserElement(oNode);
        }
    break;
    case "copyright":
        this.copyright = new XParserElement(oNode);
    break;
    case "generator":
        this.generator = new XParserElement(oNode);
    break;

RSS and Atom have quite a few elements in this part of their specifications that are similar, and this makes life easier (and keeps the amount of code down). The main difference in these few elements is the <link/> element (just like in XParserItem). Atom feeds contain a <link/> element, but the value you want to use is an attribute. Using the getAttribute() method, the href attribute's value is retrieved and is included in the XParserElement constructor. Next are the RSS-specific elements.

//RSS Tags
case "description":
    this.description = new XParserElement(oNode);
break;
case "lastbuilddate":
    this.modified = new XParserElement(oNode);
break;
case "managingeditor":
    this.author = new XParserElement(oNode);
break;

And then the Atom-specific elements.

//Atom Tags
case "tagline":
    this.description = new XParserElement(oNode);
break;
case "modified":
    this.modified = new XParserElement(oNode);
break;
case "author":
    this.author = new XParserElement(oNode);
break;
default:
break;

The feed's informational elements are parsed, and then it is time to create and populate the items array. As the name implies, the items array is a collection of XParserItem objects. The <rss:item/> and <atom:entry/> elements will be looped through and sent to the XParserItem constructor:

var oItems = null;
if (this.isRss) {
    oItems = oChannel.getElementsByTagName("item");
} else {
    try {
        oXmlDom.setProperty('SelectionLanguage', 'XPath');
        oXmlDom.setProperty("SelectionNamespaces",
            "xmlns:atom='http://www.w3.org/2005/Atom'");
        oItems = oXmlDom.selectNodes("/atom:feed/atom:entry");
    } catch (oError) {
        oItems = oChannel.getElementsByTagName("entry");
    }
}

As of version 4 of the Microsoft XML DOM, the getElementsByTagName() method has changed somewhat. In version 3 and below, qualified names were ignored when using getElementsByTagName(), so retrieving any element was as simple as passing the tag name to the method.

In version 4 and later, however, selecting an element with a default namespace requires the use of the selectNodes() method, which takes an XPath expression. The setProperty() method, which sets additional properties for the XML DOM parser, is used to set the namespace in order to make selections with selectNodes() possible. In the previous example, a try…catch block is used to discern which method should be used. The newer selectNodes() method is tried; if it fails, then getElementsByTagName() is used to retrieve the elements.

Important

selectNodes() is an Internet Explorer-only method, as it is a part of MSXML. Mozilla-based browsers continue to allow getElementsByTagName() to retrieve elements with a default namespace.
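The try…catch fallback can be exercised with mock objects to see both paths. In this sketch, selectEntries is a hypothetical helper that wraps the selection logic shown earlier, and the three mocks stand in for an MSXML-style document, a document without the MSXML extensions, and a root element.

```javascript
// Hypothetical helper mirroring the try...catch selection shown earlier.
function selectEntries(oXmlDom, oChannel) {
    try {
        oXmlDom.setProperty("SelectionLanguage", "XPath");
        oXmlDom.setProperty("SelectionNamespaces",
            "xmlns:atom='http://www.w3.org/2005/Atom'");
        return oXmlDom.selectNodes("/atom:feed/atom:entry");
    } catch (oError) {
        // setProperty()/selectNodes() are unavailable: use the DOM method.
        return oChannel.getElementsByTagName("entry");
    }
}

// Mock of an MSXML-style document that supports selectNodes().
var oMsxmlLike = {
    setProperty: function () {},
    selectNodes: function (sXPath) { return ["via selectNodes"]; }
};

// Mock of a document without the MSXML extensions.
var oPlainDom = {};

// Mock root element that supports getElementsByTagName().
var oRoot = {
    getElementsByTagName: function (sName) { return ["via getElementsByTagName"]; }
};

var aFromMsxml = selectEntries(oMsxmlLike, oRoot);
var aFromPlain = selectEntries(oPlainDom, oRoot);

console.log(aFromMsxml[0], aFromPlain[0]);
```

Calling the missing setProperty() method on the plain mock throws a TypeError, which is what routes execution into the catch block.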

After the retrieval of the <rss:item/> and <atom:entry/> elements, each individual node is passed to the XParserItem class constructor to create the items array:

this.items = []; // start from an empty collection on every load
for (var i = 0; i < oItems.length; i++) {
    this.items[i] = new XParserItem(oItems[i]);
}

At this point, all of the important elements are parsed and contained in their corresponding properties. The only thing left to do is fire the onload event:

if (typeof this.onload == "function") {
    this.onload();
}

When onload was first declared, it was assigned the value of null. Because of this, it is important to check the onload type. If it is a function (meaning that the event now has a handler), then the onload() method is called. Use of this event will be seen later in the chapter.
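The type check can be sketched in isolation. MiniParser is a hypothetical, stripped-down stand-in for XParser that stores the XML string and fires onload only when a handler has been assigned.

```javascript
// Hypothetical stand-in for XParser: no parsing, only the event logic.
function MiniParser() {
    this.onload = null;
}
MiniParser.prototype.load = function (sXml) {
    this.data = sXml;
    if (typeof this.onload == "function") {
        this.onload(); // called as a method, so `this` is the parser
    }
};

// No handler assigned: load() completes without calling anything.
var oSilent = new MiniParser();
oSilent.load("<feed/>");

// Handler assigned before load(): it runs once the data is stored.
var oNotified = new MiniParser();
var sSeen = null;
oNotified.onload = function () {
    sSeen = this.data;
};
oNotified.load("<rss/>");

console.log(oSilent.data, sSeen);
```

Without the typeof check, the first call would fail because null cannot be invoked as a function.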

The User Experience

The user interface may be the most important part of any application; if the user cannot use the application, there is no reason for the application to exist. FooReader.NET was designed to be easily used and understood by the user. In fact, it borrows heavily from the Microsoft Outlook 2003 user interface. It has a three-pane interface where the first two panes are a fixed width while the third pane is fluid (see Figure 5-1).

Figure 5-1

The first pane, called the feeds pane, displays the different feeds as links that the user can click on. A <div/> element with an id of divFeedList in the feeds pane allows the feeds list to be written to the document. Because this pane is populated dynamically, only the containing elements are statically written:

<div id="divFeedsPane">
    <div class="paneheader">Feeds</div>
    <div id="divFeedList"></div>
</div>

When the user clicks on a feed, the feed items are loaded into the middle pane: the items pane. This second pane has two elements that are used to display information. The first is a <div/> element with an id of divViewingItem. This element displays two things to the user: which feed they are currently reading and the feed document type (RSS or Atom). The second element is another <div/> element whose id attribute is set to divItemList. This element will contain a list of <item/> RSS elements or <entry/> Atom elements. Like the feeds pane, only the containing elements are statically written to the page:

<div id="divItemPane">
    <div class="paneheader">Items</div>
    <div id="divViewingItem"></div>
    <div id="divItemList"></div>
</div>

If the user single-clicks an item, it loads the item into the last pane: the reading pane. This pane has three elements that display the item's information. The first, whose id is divMessageTitle, is where the <title/> element of RSS and Atom feeds is displayed. The second element has an id of aMessageLink whose href attribute is changed dynamically. Finally, the last element is divMessageBody, where the <rss:description/> and <atom:content/> elements are displayed:

<div id="divReadingPane">
    <div class="contentcontainer">
        <div class="messageheader">
            <div id="divMessageTitle"></div>
            <a href="" id="aMessageLink" title="Click to goto posting."
                 target="_new">Travel to Post</a>
        </div>
        <div id="divMessageBody"></div>
    </div>
</div>

Usability

There are a few usability issues that you should be concerned with. For one, users expect web applications to function like any other application. In Outlook 2003, from which this application borrows heavily, double-clicking an item in the items pane pulled up the specific e-mail message in a new window. Since FooReader.NET is based on that model, double-clicking an item opens a new window and takes the user to the specific blog post or article. To achieve this, the ondblclick event on the items is assigned the doubleClick() handler (you'll see this assignment later):

function doubleClick() {
    var oItem = oFeed.items[this.getAttribute("frFeedItem")];
    var oWindow = window.open(oItem.link.value);
}

Second, the user needs to know when the application is working to fill a request. The only time a request is made to the server is when a feed is loaded, so some type of visual cue needs to show the user that something is happening. For this purpose, a loading screen is shown when a feed is requested and is hidden when the feed is loaded. Figure 5-2 shows this user interface cue in action.

Figure 5-2

This loading cue is controlled by a JavaScript function called toggleLoadingDiv(). It takes one argument, a Boolean value that determines whether the cue is shown:

function toggleLoadingDiv(bShow) {
    var oToggleDiv = document.getElementById("loadingDiv");
    oToggleDiv.style.display = (bShow)?"block":"none";
}

Server-Side Components

In a perfect world, a simple application such as FooReader.NET would be strictly client-side. JavaScript would be able to retrieve XML feeds across domains with XMLHttp, and there would be no need to make any calls to a server component. Because of the Internet Explorer and Firefox security restrictions, however, it is not possible to retrieve data from a different domain; thus, a server-side component is required.

Possible Paradigms

The server's job in FooReader.NET is to retrieve the remote XML feeds for the client to use. Following this model, there are two possible design paths for the server; both have their pros and cons.

The first method is a cached feed architecture. The server program would act as a service, fetching a list of feeds at a certain time interval, caching them, and serving the cached feeds to the client when requested. This option potentially saves bandwidth, but it also risks the reader not having up-to-date feeds. More user action would be required to display the current, up-to-date feeds, which goes against the Ajax ideology.

The second method is a delivery on demand architecture, where the server would retrieve any given feed when the user requests it. This may use more bandwidth, but it ensures the reader will have up-to-date information; moreover, this design is in line with the Ajax concepts and is what the user would expect.

Implementation

FooReader.NET uses the delivery on demand model, with the exception that a feed is cached when it is fetched. The cached version is used only in the event that the remote host cannot be contacted and an up-to-date feed cannot be retrieved. This ensures that the user has something to read, even though it is older data.
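The control flow of this hybrid approach can be modeled in a few lines. The server code in this chapter is written in C#; the JavaScript below is only a sketch of the decision logic, and the names getFeed, fetchRemote, fakeFetch, and oCache are hypothetical. The fetch function and cache are passed in so the logic can run without a network.

```javascript
// Hypothetical sketch of "fetch on demand, fall back to the cached copy".
function getFeed(sUrl, fetchRemote, oCache) {
    try {
        var sXml = fetchRemote(sUrl); // always try the remote host first
        oCache[sUrl] = sXml;          // refresh the cached copy on success
        return { xml: sXml, fromCache: false };
    } catch (oError) {
        if (oCache.hasOwnProperty(sUrl)) {
            return { xml: oCache[sUrl], fromCache: true }; // older but readable
        }
        throw oError; // nothing cached: surface the failure
    }
}

// Fake remote host that can be switched off to simulate an outage.
var oCache = {};
var bHostUp = true;
function fakeFetch(sUrl) {
    if (!bHostUp) {
        throw new Error("host unreachable");
    }
    return "<rss version='2.0'/>";
}

var oFirst = getFeed("http://example.com/feed", fakeFetch, oCache);
bHostUp = false;
var oSecond = getFeed("http://example.com/feed", fakeFetch, oCache);

console.log(oFirst.fromCache, oSecond.fromCache);
```

The cache is consulted only inside the catch block, which matches the rule that the cached version is used solely when an up-to-date feed cannot be retrieved.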

Because the server is responsible only for pulling and caching remote feeds, it makes sense to have one ASP.NET page responsible for these operations. This page, called xml.aspx, will have a code-behind file where a good deal of ASP.NET code is contained.

Note

Code-behind is a method for authoring web pages for the ASP.NET platform. Unlike inline programming models, where the server-side code is interspersed with HTML markup (like PHP and ASP), code-behind enables you to remove all logic from the HTML code and place it in a separate class file. This results in a clean separation of HTML and your .NET programming language of choice.

The entry point for the server-side is the Page_Load event handler, where a method called StartFooReader() is called. It is in StartFooReader() where your code will be contained.

The language of choice for this project is C#, which is the language created specifically for the .NET Framework.

Setting the Headers

For this application, you must set a few headers. Setting headers in ASP.NET is a simple task:

Response.ContentType = "text/xml";
Response.CacheControl = "No-cache";

Headers are set with the Response object, which encapsulates HTTP response information. Setting the MIME content type is imperative to the operation of the application. Mozilla-based browsers will not load an XML file as XML unless the MIME type specifies an XML document, and "text/xml" is one of many types that do this.

It is also important to make sure that the XML data retrieved with XMLHttp is not cached. Internet Explorer (IE) caches all data retrieved with XMLHttp unless explicitly told not to with the CacheControl header. If this header is not set, IE will use the cached data until the browser's cache is dumped.

Getting the Remote Data

To determine the feeds to display, FooReader.NET uses a proprietary XML document, feeds.xml, which contains a list of feeds that are available for the user to request. This file contains a set of <link/> elements divided into sections by <section/> elements. Each <link/> element has a filename attribute which is similar to an id attribute in HTML; it must be unique, and it is an identifier for the <link/> element:

<?xml version="1.0" encoding="utf-8"?>
<feeds>
    <section name="News">
        <link name="Yahoo! Top Stories" filename="yahoo_topstories"
            href="http://rss.news.yahoo.com/rss/topstories" />
    </section>
</feeds>

The example shows a basic feeds list. A typical list can contain as many <section/> and <link/> elements as you desire. Note that <section/> elements can only be children of the root element, and <link/> elements can only be contained in <section/> elements.

Note

The name attribute of the <section/> and <link/> elements is displayed to the user in the feeds pane and is not used in any other operation.

When requesting a feed, the value of the filename attribute is assigned to the xml variable in the query string.

xml.aspx?xml=fileName
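On the client, building this URL is a one-line operation. The helper name buildFeedRequestUrl is hypothetical; only the xml.aspx page and its xml variable come from the text above. Escaping the value with encodeURIComponent() is an added precaution for file names that contain characters with special meaning in a query string.

```javascript
// Hypothetical helper that builds the request URL for a feed.
function buildFeedRequestUrl(sFileName) {
    return "xml.aspx?xml=" + encodeURIComponent(sFileName);
}

var sUrl = buildFeedRequestUrl("yahoo_topstories");
var sEscaped = buildFeedRequestUrl("my feed&more"); // space and & are escaped

console.log(sUrl);
console.log(sEscaped);
```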

In ASP.NET, the Request object contains a NameValueCollection called QueryString. Using this collection, you can extract the value of the xml variable in the query string:

if (Request.QueryString["xml"] != null)
{
    string xml = Request.QueryString["xml"];
    FeedsFile feedsFile = new FeedsFile(Server.MapPath("feeds.xml"));

In the first line, the existence of the xml variable in the query string is checked. That value is then assigned to the xml variable, and a FeedsFile object is then instantiated.

The FeedsFile class contains a method called GetLinkByFileName, which returns a FeedsFileLink object containing the information of a specific <link/> element. A string variable, fileName, is assigned the path of the cached feed, which is used later.

FeedsFileLink link = feedsFile.GetLinkByFileName(xml);
string fileName = string.Format(
    @"{0}\xml\{1}.xml",Server.MapPath(String.Empty),link.FileName);
