Daily Archives: May 6th, 2008

XML is the Extensible Markup Language, a format for structured documents that can hold large amounts of information. This information can be made up of words and images. The XML structure is designed to identify and isolate the different parts of the information and content contained within the structured document. XML is a markup language rather than a programming language, and it looks similar to HTML, though the two are very different. XML was created so that many different kinds of information can be contained within it, unlike HTML, which is designed for specific content and where the tags are pre-defined. In XML the tags are custom, so there is no fixed tag list and no pre-set meaning attached to the content of the tags.

XML is stored in a plain text format, so many programs can read it, but only specialised ones can interpret it, depending on what you are looking to use it for. XML has many different uses; according to http://www.w3schools.com/xml/xml_syntax.asp some of these include:

Separating data from HTML - Information for websites can be stored in an external or separate XML file and read later by another language (HTML, PHP) when being pulled onto the page. The advantage of this is that the HTML code can be dedicated to styling the appearance of the website or organising the placement of content.

Exchanging data - XML allows easier storage of data and information. Programs can be used that make storing information inside XML files more secure and easy to access, which makes exchanging it between software (such as the Microsoft Office applications) much easier.

Creating new languages - XML can be used to define new markup languages. An example is WML (Wireless Markup Language), which is used for mobile technology on hand-held devices; XML is the parent language of this one.
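The first use above, keeping data in a separate XML file and pulling it onto a page, can be sketched roughly as follows. This is only an illustration: Python stands in for PHP, and the products data, its tag names and the render_list function are made up for the example rather than taken from any of the sources.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data that would normally live in its own .xml file.
products_xml = """
<products>
  <product><name>Lemonade</name><price>$2.50</price></product>
  <product><name>Chips &amp; dip</name><price>$4.50</price></product>
</products>
""".strip()

def render_list(xml_text):
    """Turn the XML data into an HTML list, keeping data and markup separate."""
    root = ET.fromstring(xml_text)
    rows = []
    for product in root.findall("product"):
        # escape() makes characters such as & safe to place inside HTML.
        name = escape(product.findtext("name", default=""))
        price = escape(product.findtext("price", default=""))
        rows.append("<li>" + name + " (" + price + ")</li>")
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"

print(render_list(products_xml))
```

The page template only needs to know how to call render_list; the data can be changed by editing the XML alone.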

There are many other uses for the language, including RSS, which is also written in XML. The language has very specific syntax rules for how it should appear and be written, and a piece of code that follows these rules is known as ‘well-formed’. A well-formed document can be read by any XML-aware program or website with no problems at all, but if a document breaks any of the rules it is not well-formed and a parser will refuse to read it. To be well-formed there are over 100 rules that must be followed; these are available to read here: W3 XML. W3Schools also lists a simple set of rules that must be followed for a document to be well-formed:

  • XML documents must have a root element
  • XML elements must have a closing tag
  • XML tags are case sensitive
  • XML elements must be properly nested
  • XML attribute values must be quoted

Source: http://www.w3schools.com/Xml/xml_dtd.asp

They also give an example of what a simple well-formed piece of XML would look like. It complies with the simplest of standards: every element has an opening and a closing tag, there is a root node, and all elements within the code are properly nested.

<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
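One way to try the rules out is to hand a document to an XML parser and see whether it is accepted. The sketch below uses Python's standard xml.etree.ElementTree module; the is_well_formed helper and the broken variants are my own illustrations, each one breaking a single rule from the list above.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The note example from above (XML declaration left out for brevity).
good_note = (
    "<note>"
    "<to>Tove</to>"
    "<from>Jani</from>"
    "<heading>Reminder</heading>"
    "<body>Don't forget me this weekend!</body>"
    "</note>"
)

# Hypothetical broken variants, one per rule.
broken_notes = {
    "two root elements": "<note></note><note></note>",
    "missing closing tag": "<note><to>Tove</note>",
    "wrong case in closing tag": "<note><To>Tove</to></note>",
    "badly nested elements": "<note><to><from>Tove</to></from></note>",
    "unquoted attribute value": "<note priority=high></note>",
}

print("good note:", is_well_formed(good_note))
for rule, text in broken_notes.items():
    print(rule + ":", is_well_formed(text))
```

The good note is accepted and every broken variant is rejected, which is the all-or-nothing behaviour that well-formedness implies.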

The elements within the structured text must form a tree. tizag.com shows how the tree structure fits in easily with the XML concept by using indenting; here is an example:

<inventory>
	<drink>
		<lemonade>
			<price>$2.50</price>
			<amount>20</amount>
		</lemonade>
		<pop>
			<price>$1.50</price>
			<amount>10</amount>
		</pop>
	</drink>

	<snack>
		<chips>
			<price>$4.50</price>
			<amount>60</amount>
		</chips>
	</snack>
</inventory>

You can easily see that the structure follows the same principle as HTML, and within each of the “nodes” are more nodes, which gives it the tree structure that appears. Every XML document must have a root element, which in this case is inventory. Within that are descendant nodes like drink, which is the parent node of lemonade. Each of the nodes includes a closing tag so it is known where they begin and end, and which node is the parent of which. Here is a diagram showing how this is formed within a tree:

Source: http://www.tizag.com/xmlTutorial/xmltree.php
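The same tree can also be walked in code. The following is a small sketch using Python's standard xml.etree.ElementTree module with the inventory example from above; the outline function is a made-up helper, not something from the tizag.com tutorial.

```python
import xml.etree.ElementTree as ET

inventory_xml = """
<inventory>
    <drink>
        <lemonade><price>$2.50</price><amount>20</amount></lemonade>
        <pop><price>$1.50</price><amount>10</amount></pop>
    </drink>
    <snack>
        <chips><price>$4.50</price><amount>60</amount></chips>
    </snack>
</inventory>
""".strip()

def outline(element, depth=0):
    """Return one indented line per element, each parent before its children."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag
    if text:
        line += ": " + text
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(inventory_xml)
print("\n".join(outline(root)))
```

The indentation of each printed line matches the depth of the node, so the root element inventory sits at the left and price and amount sit furthest to the right.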

XML is useful for many different applications and is used widely for the web and other resources. It is excellent at storing and structuring data, though as a plain text format it does not compress it, and it relies upon specific software to interpret it. It has been implemented in various popular applications, including Microsoft Office 2007, which uses XML-based file formats as its own format.

Resources

http://www.developer.com/tech/article.php/797861

http://www.tizag.com/xmlTutorial/xmltree.php

http://www.xml.com/pub/a/98/10/guide0.html?page=2#AEN63

http://assit1.ucsm.ac.uk:8080/bodington/site/ist205/week7/reading/xml-apps.pdf

Apache is the most popular web server available on the internet. The Apache web server is an open-source project, meaning that the source code and all information about it is publicly available to anyone who wishes to take it. Often the code is adapted and changed by users and submitted back to the Apache Software Foundation, which can include it in public releases. The software was originally designed for use on UNIX servers but is now also available for Windows operating systems. The package is released under the Apache License, meaning that it is free to use on any of these systems. A survey conducted by Netcraft in February 2008 showed that over 50% of websites use the Apache web server:

Developer   January 2008   Percent   February 2008   Percent   Change
Apache        78,735,581    50.61%      80,580,183    50.93%     0.33
Microsoft     55,709,926    35.81%      56,265,527    35.56%    -0.24
Google         8,290,471     5.33%       8,169,930     5.16%    -0.16
lighttpd       1,536,981     0.99%       1,565,536     0.99%     0.00
Sun              557,673     0.36%         547,510     0.35%    -0.01

Source: http://news.netcraft.com/archives/web_server_survey.html

These figures show the huge popularity of the Apache server, but it is not the only web server available, and others have gained popularity recently. You can see from the figures that Microsoft is second to Apache, with about 35% of websites using its IIS web server. A study published in 2007 by search-this.com showed that Microsoft's server was used mostly on the websites of Fortune 500 companies in America: Microsoft holds 50% of this market and, surprisingly, Apache holds a small 15%, staying even with Sun and IBM. You can see this in the following chart:

Source: http://www.search-this.com/2007/06/27/microsoft-iis-vs-apache-who-serves-more/

In recent months Google has added an additional 1.1 million websites to the internet due to the rise of Google Blogger, which has boosted its figures in the market and made it one of the most popular blogging systems available. One suggestion as to why Google and Apache are less popular with Fortune 500 companies is that most are older companies that had Microsoft's server systems implemented in the past and do not wish to change to a more modern and popular approach.

PostgreSQL is an Object-Relational Database Management System, or ORDBMS. Like Apache, the software is open source, meaning that anyone can access and modify the source code under the terms of its license. This is not the only version of PostgreSQL available, as there is also a commercial version which can be purchased for the Red Hat Linux operating system. PostgreSQL itself runs on Linux and other Unix-like systems as well as on Windows. The support for PostgreSQL is very wide because it is free for anyone to use, including commercially, making it popular within certain circles compared to professional alternatives. The main competitor for this software is MySQL.

MySQL is possibly the most popular relational database management system (RDBMS) used on the internet. It is also open source and has by far the most literature and support available for it. An article on oreillynet.com claims that the rapid growth of MySQL can only be outdone by the rapid growth of Apache, and that it has outdone PostgreSQL even though the two are roughly equal in technical features. The site also posted a comparison of the two as they stood when the article was written:

Feature                 PostgreSQL                    MySQL
ANSI SQL compliance     Closer to ANSI SQL standard   Follows some of the ANSI SQL standards
Performance             Slower                        Faster
Sub-selects             Yes                           No
Transactions            Yes                           Yes, however InnoDB table type must be used
Database replication    Yes                           Yes
Foreign key support     Yes                           No
Views                   Yes                           No
Stored procedures       Yes                           No
Triggers                Yes                           No
Unions                  Yes                           No
Full joins              Yes                           No
Constraints             Yes                           No
Windows support         Yes                           Yes
Vacuum (cleanup)        Yes                           No
ODBC                    Yes                           Yes
JDBC                    Yes                           Yes
Different table types                                 Yes

Overall each of them has advantages and disadvantages, and which to use depends on your requirements and personal preference.

Resources

http://www.ariadne.ac.uk/issue19/what-is/

http://www.faqs.org/docs/ppbook/c208.htm

http://ftp.uoi.gr/pub/databases/mysql/doc/refman/5.0/en/what-is-mysql.html

http://www.search-this.com/2007/06/27/microsoft-iis-vs-apache-who-serves-more/

http://www.webopedia.com/TERM/A/Apache_Web_server.html

http://news.netcraft.com/archives/web_server_survey.html

RSS is a system for letting users view information in the simplest possible format. It is a technology that allows specific reading software to download information (or “feeds”) from a website so a person can read it at their own leisure. It’s difficult to define what RSS stands for as there are many different opinions; through research, the two most popular that I have found are Rich Site Summary and Really Simple Syndication.

RSS - A technology that allows web users to receive (ongoing, constantly updated) information collected from many sources through a simple reader. This is supplied through an “RSS feed” that users can subscribe to. - http://iws.cit.cornell.edu/iws2/technology/techinfo.cfm

The above source defines RSS as being ongoing and constantly updated, meaning that new information is regularly downloaded to the software and the website must be updated regularly. There are many practical applications for this on the internet, but the most popular use would be for news websites, where information is updated many times every day. Every time the program is used it downloads all of the items it hasn’t yet downloaded and sorts them by date.
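The "download what is new, then sort by date" step can be sketched in a few lines of Python. This is only a guess at how a reader might do it, not a description of any particular program: the items, links and the new_items_by_date function are invented for the example, and the dates use the e-mail style date format that RSS 2.0 feeds use for pubDate.

```python
from email.utils import parsedate_to_datetime

def new_items_by_date(items, seen_links):
    """Return items not seen before, newest first, and remember them as seen."""
    fresh = [item for item in items if item["link"] not in seen_links]
    # parsedate_to_datetime turns the textual date into a sortable datetime.
    fresh.sort(key=lambda item: parsedate_to_datetime(item["pubDate"]),
               reverse=True)
    seen_links.update(item["link"] for item in fresh)
    return fresh

# Hypothetical items as they might come out of a downloaded feed.
items = [
    {"title": "Older story", "link": "http://www.example.com/1",
     "pubDate": "Sun, 04 May 2008 08:00:00 +0000"},
    {"title": "Newest story", "link": "http://www.example.com/3",
     "pubDate": "Tue, 06 May 2008 09:30:00 +0000"},
    {"title": "Middle story", "link": "http://www.example.com/2",
     "pubDate": "Mon, 05 May 2008 17:15:00 +0000"},
]

seen = {"http://www.example.com/1"}  # the reader already has this one
for item in new_items_by_date(items, seen):
    print(item["pubDate"], item["title"])
```

Running the function a second time with the same items returns nothing, because every link is now in the seen set.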

RSS not only contains the content of the site, but also information about how it was published for the software to read. This includes the date of publication, the title of the page and content, any images that are used, the language of the content and the website that the information is taken from. RSS is not just used for news: many other websites, such as the popular shopping site Amazon, integrate it with their sites to offer more features for their customers, such as keeping track of new items up for sale within a specific category, watching price changes or even waiting for stock to change.

RSS is an application of XML (Extensible Markup Language), and one version of it, RSS 1.0, is built on the RDF standard, which is:

A set of rules (a sort of language) for creating descriptions of information, especially information available on the World Wide Web. - http://www.unitedyellowpages.com/internet/terminology.html

The structure of RSS must comply with the standards of XML 1.0 to be recognised by the reading software that is downloading it. By using the XML structure, RSS works very similarly to markup languages like HTML. From http://www.codeproject.com/KB/tips/RSS.aspx you can see the basic structure of how an RSS item needs to be laid out using the XML structure:

<item>
    <title>My Articles</title>
    <link>www.MyCollection.com/articles</link>
    <description>list of Articles written by me.</description>
</item>
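Because an RSS feed is ordinary XML, reading software can pull the fields out with any XML parser. Here is a minimal sketch using Python's standard xml.etree.ElementTree module. The item is the one shown above; the surrounding rss and channel elements follow the usual RSS 2.0 layout, but the channel's title, link and description are placeholders I have added, and read_items is a made-up helper.

```python
import xml.etree.ElementTree as ET

feed_xml = """
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://www.example.com/</link>
    <description>Placeholder channel used for this sketch.</description>
    <item>
      <title>My Articles</title>
      <link>www.MyCollection.com/articles</link>
      <description>list of Articles written by me.</description>
    </item>
  </channel>
</rss>
""".strip()

def read_items(xml_text):
    """Return a list of dicts, one per <item> found in the feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items

for entry in read_items(feed_xml):
    print(entry["title"], "-", entry["link"])
```

If the feed were not well-formed XML, ET.fromstring would raise an error instead of returning any items, which is why the structure has to follow the XML rules exactly.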

Resources

http://www.codeproject.com/KB/tips/RSS.aspx

http://www.amazon.com/gp/tagging/rss-help.html

http://iws.cit.cornell.edu/iws2/technology/techinfo.cfm

A blog, short for ‘weblog’, is the modern version of a journal, the only difference being that it is live on the internet and anyone (usually) can read it. blogger.com describes blogging as giving users their own voice on the web. Information contained within a blog ranges from professional to private thoughts, and blogs have a varied audience of all ages. Most blogs on the internet have some form of feedback, so users do not just read them but can leave comments with opinions or information on the webpage, and ideas and content can be shared. Blogging can also be compared to content management systems, as the two are very similar: a content management system is essentially a way to organise and manage data, and blogging is the same. Many blogging websites, such as this one, offer options for continuously editing and organizing data.

There are many different ways to blog. Websites such as this one offer features to quickly publish and style websites using pre-made layouts, but there is also software available that allows a blog to be run on your own server so it can be customized however you want. The blogging system I tested was Thingamablog (http://thingamablog.sourceforge.net/). This application was very easy to use and allowed me to create and customize blogs quickly and effortlessly. The program has lots of features that allow users to update their blogs from wherever they are, whether at work, school or elsewhere. There are advantages to this kind of software, as it is designed to be user friendly, but because it has to be installed onto a server and managed, anyone using it would need some computer experience or at least have to learn how to do this. With the online blogging websites and applications it becomes a lot easier for novice users to manage, as all of the features are easy to understand and find. In September 2007 the blogging website problogger.net ran a survey to find out how popular blogging had become and how long people had been blogging for; there were 2151 responses and the results show:

Source: http://www.problogger.net/archives/2007/09/17/how-long-have-you-been-blogging-poll-results/

In comparison with a Content Management System (CMS), blogging is more centred on what is being written on the website and what the content on the site should be. Although it still organises the content in the same way that a CMS does, its primary focus is on giving the user an easy method of adding their information and letting the blogging application or website take care of managing the look and feel of the website, including the layout and style, which are customizable. These days blogging is extremely popular and there are lots of systems available for creating blogs. Some are focused mainly on customizing the look and feel, like http://www.typepad.com, where users can customize the layout of the site using CSS, and some are more directly based on the content, like http://www.blogger.com and http://www.wordpress.com. These aren’t the only sorts of websites that offer blogging features, though: websites like MySpace and Bebo have built-in blogging systems, and the wide variety of features on these kinds of websites makes them incredibly popular, classing them as advanced content management systems rather than being just for blogging.

Resources:

http://www.bbc.co.uk/webwise/askbruce/articles/browse/blogging_1.shtml

http://www.blogger.com/about

http://codex.wordpress.org/Introduction_to_Blogging