SimpleXML is probably the easiest way to parse an XML document. We have created a class that only has to be instantiated to fetch a feed's data, and added a utility method that displays the data in an HTML table. We have also added some simple CSS to make the table look better than the browser's default rendering. Note that this is for demonstration and teaching purposes only: for lengthy XML files it could be better to use XMLReader instead of SimpleXML, because SimpleXML loads the entire input into memory at once, which is not a good idea for a big XML file.
This is a practical tutorial on using SimpleXML to read XML files.
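For comparison, the XMLReader alternative mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the function name streamItemTitles() is hypothetical and not part of the WordPressFeed class, and it assumes the DOM extension is available so that one item at a time can be handed to SimpleXML.

```php
<?php
// Hypothetical helper (not part of the WordPressFeed class): streams an RSS
// document and collects the item titles without loading the whole tree.
function streamItemTitles(string $uri): array
{
    $titles = array();
    $reader = new XMLReader();
    // open() accepts a file path or URL and returns false on failure
    if (!$reader->open($uri)) {
        return $titles;
    }
    $doc = new DOMDocument();
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'item') {
            // Only the current <item> is expanded into an in-memory node
            $node = $reader->expand();
            if ($node !== false) {
                $item = simplexml_import_dom($doc->importNode($node, true));
                $titles[] = (string) $item->title;
            }
        }
    }
    $reader->close();
    return $titles;
}
```

Only one item is held as a SimpleXML object at any time, so memory use should stay roughly constant however long the feed is.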
We start with the standard HTML skeleton, include our WordPressFeed class and start feeding it with feeds.
<!DOCTYPE html>
<!-- To get the XML feed of a WordPress page you enter: http://www.page.com/feed or http://www.page.com/comments/feed or the relevant blog URL of the feed -->
<html lang="en">
<head>
    <style>
    table.feed_table {
        font-family: 'Arial', sans-serif;
        width: 75%;
        margin: 25px auto;
    }
    .feed_table tr:nth-of-type(even) {
        background-color: #eee;
    }
    .feed_table th {
        font-size: 1.8em;
        padding: 10px;
        color: #fff;
        background-color: #333;
    }
    .feed_table td {
        border: 1px solid #aaa;
        text-align: center;
    }
    </style>
    <meta charset="UTF-8">
    <title>WordPress Feed Fetcher</title>
</head>
<body>
<?php
require_once("WordPressFeed.php");
// Those two lines can be used to display a WP feed in an HTML table
$phpgang = new WordPressFeed("http://www.phpgang.com/feed/");
echo $phpgang->showTabular();
$infosec = new WordPressFeed("http://resources.infosecinstitute.com/feed/");
echo $infosec->showTabular();
$phpgang_posts = $phpgang->rawResults;
?>
</body>
</html>
You can see that we have included the styles for the table in the head, but you can move them to an external stylesheet and change them easily.
Then we define our class. Its full code, with comments inserted directly in it, is listed after the following usage notes.
A feed's data can be retrieved the following way:
$phpgang = new WordPressFeed("http://www.phpgang.com/feed/");
Then the raw results (an array of all posts/articles and their data) can be accessed with the public property:
$phpgang->rawResults;
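As a rough illustration of working with that array, the sketch below filters posts by category. The sample data is hand-written in the same shape as rawResults (all values are made up), and titlesInCategory() is a hypothetical helper, not a method of the class.

```php
<?php
// Hand-written sample in the same shape as $rawResults (values are made up).
$posts = array(
    array('title' => 'First post', 'link' => 'http://example.com/1', 'author' => 'Ann',
          'categories' => array('PHP', 'XML'), 'description' => '<p>Intro</p>'),
    array('title' => 'Second post', 'link' => 'http://example.com/2', 'author' => 'Bob',
          'categories' => array('CSS'), 'description' => '<p>Styles</p>'),
);

// Hypothetical helper: titles of all posts filed under a given category.
function titlesInCategory(array $posts, string $category): array
{
    $titles = array();
    foreach ($posts as $post) {
        if (in_array($category, $post['categories'], true)) {
            $titles[] = $post['title'];
        }
    }
    return $titles;
}

$phpTitles = titlesInCategory($posts, 'PHP');
```

With a real feed, the array taken from $phpgang->rawResults could be passed in place of the sample.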
The utility function to display the data in a table can be used the following way:
echo $phpgang->showTabular();
<?php
// 30 seconds may not be enough
ini_set('max_execution_time', 90);
/*
 * You can add methods to show in different ways, save to database. Anything you wish.
 * You can also modify the class so it checks whether the elements exist or make it search for attributes or missing elements or whatever you want.
 */
class WordPressFeed {
    // $site holds the name of the site ("phpgang", "lada", etc.)
    private $site;
    // $feedURL is the argument that you pass to the constructor
    private $feedURL;
    // $rawResults is an array with all articles/posts.
    public $rawResults;

    // creates $site
    private function getSite() {
        // get only the site's name and assign it to $site
        $site_arr = explode(".", $this->feedURL);
        array_pop($site_arr);
        $this->site = array_pop($site_arr);
    }

    // gets the feed's data upon instantiation
    public function __construct($feedURL) {
        $this->feedURL = $feedURL;
        $this->rawResults = $this->getFeed();
        $this->getSite();
    }

    public function showTabular() {
        // shows the feed as a table
        if (count($this->rawResults) > 0) {
            ?>
            <h1 style="text-align: center;">Feed for Website: <?php echo $this->site ?></h1><hr />
            <table class="feed_table">
            <tr>
                <th>Title</th>
                <th>URL</th>
                <th>Author</th>
                <th>Categories</th>
                <th>Description</th>
            </tr>
            <?php
            foreach ($this->rawResults as $articleElement) {
                echo "<tr>";
                echo "<td>{$articleElement['title']}</td>";
                echo "<td>{$articleElement['link']}</td>";
                echo "<td>{$articleElement['author']}</td>";
                // create a string from all array indices
                $categories = implode(", ", $articleElement['categories']);
                // remove html tags such as images and links from the description
                $description = strip_tags($articleElement['description']);
                echo "<td>$categories</td>";
                echo "<td>$description</td>";
                echo "</tr>";
            }
            echo "</table>";
        }
    }

    private function getFeed() {
        // get feed's data
        // if it is an .xml file we need to use simplexml_load_file()
        $feed = file_get_contents($this->feedURL);
        $allArticles = array();
        $xml = simplexml_load_string($feed);
        // loop through each post
        foreach ($xml->channel->item as $item) {
            $category = array();
            // add the data about that post to a variable with categories being an array of all the categories for the post
            $article['title'] = htmlspecialchars((string) $item->title);
            $article['link'] = htmlspecialchars((string) $item->link);
            // get the creator element that is in the dc namespace
            $namespaces = $item->getNamespaces(true);
            $dc_namespace = $item->children($namespaces['dc']);
            $article['author'] = htmlspecialchars((string) $dc_namespace->creator);
            foreach ($item->category as $single_category) {
                $category[] = htmlspecialchars((string) $single_category);
            }
            $article['categories'] = $category;
            $article['description'] = (string) $item->description;
            // add the article to the array of articles
            $allArticles[] = $article;
        }
        return $allArticles;
    }
}
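The comment at the top of the class suggests checking whether elements exist. One possible way to do that is sketched below. parseFeedSafely() is a hypothetical standalone function, not the article's getFeed(); it takes the XML as a string and assumes that returning an empty array is an acceptable way to signal failure.

```php
<?php
// Hypothetical helper illustrating existence checks; not the article's getFeed().
function parseFeedSafely(string $feed): array
{
    // Collect libxml parse errors internally instead of emitting warnings
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($feed);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Malformed XML or a document without <channel> gives an empty result
    if ($xml === false || !isset($xml->channel)) {
        return array();
    }

    $articles = array();
    foreach ($xml->channel->item as $item) {
        // Missing elements fall back to an empty string
        $articles[] = array(
            'title' => isset($item->title) ? htmlspecialchars((string) $item->title) : '',
            'link'  => isset($item->link) ? htmlspecialchars((string) $item->link) : '',
        );
    }
    return $articles;
}
```

The same checks could be folded into getFeed() itself, for example by also testing the return value of file_get_contents() before parsing.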
It is important to cast the XML elements to strings; otherwise they would remain SimpleXML objects. Also, an XML element named something like dc:creator is a creator element in the dc namespace, so the namespace has to be accessed first and the creator element retrieved from it.
You cannot simply access it with something like:
$item->{"dc:creator"};
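The difference can be tried out on a small inline document. The sketch below uses a made-up item; the namespace URI is the usual Dublin Core one that WordPress feeds declare for dc, which is an assumption about the feed rather than something the class enforces.

```php
<?php
// Made-up sample item; the dc namespace URI is the Dublin Core one.
$xml = <<<XML
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>Hello</title>
      <dc:creator>Jane</dc:creator>
    </item>
  </channel>
</rss>
XML;

$rss  = simplexml_load_string($xml);
$item = $rss->channel->item;

// Direct property access only sees elements without a namespace
$direct = (string) $item->{'dc:creator'};

// children() with the namespace URI exposes the dc elements
$namespaces = $item->getNamespaces(true);
$author = isset($namespaces['dc'])
    ? (string) $item->children($namespaces['dc'])->creator
    : '';

// Without the cast the value is still an object, not a string
$isObject = $item->title instanceof SimpleXMLElement;
```

Here $direct ends up as an empty string while $author holds the creator's name, and the isset() check avoids a notice when a feed does not use the dc namespace at all.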
When we give it a WordPress feed and call the showTabular() method on the instance, we get a page that resembles the graphic below:
To get the code for this article, please visit: http://www.phpgang.com/using-php-and-simplexml-to-parse-wordpress-feeds_824.html