
simplehtmldom ISSUE

Folks,

Check this XML sitemap link extractor:

[code]
include_once('simplehtmldom_1_9_1/simple_html_dom.php');

//Works.
//$sitemap = 'https://www.rocktherankings.com/post-sitemap.xml';
//$sitemap = "https://www.rocktherankings.com/sitemap_index.xml"; //Has more xml files.

//Does not work. Shows blank page.
$sitemap = "https://bytenota.com/sitemap.xml";

$html = new simple_html_dom();
$html->load_file($sitemap);

//Print the content of every <loc> element found in the sitemap.
foreach($html->find("loc") as $link)
{
    echo $link->innertext."<br>";
}
[/code]

It manages to extract links to HTML pages as well as XML files from these 2 XML sitemaps:
**$sitemap = 'https://www.rocktherankings.com/post-sitemap.xml'; //Has no further XML sitemaps listed.
$sitemap = "https://www.rocktherankings.com/sitemap_index.xml"; //Lists more XML files.**

So far, so good.
But why does it fail to extract further XML sitemap links from the following XML sitemap? That is the big issue!
https://bytenota.com/sitemap.xml


3 Comments

@novice2022 (author), Nov 07, 2022: Hiya,

This ain't right, is it?

<?xml version="1.0" encoding="UTF-8"?>
<!-- This sitemap was dynamically generated on October 26, 2022 at 11:39 pm by All in One SEO - the original SEO plugin for WordPress. -->

<?xml-stylesheet type="text/xsl" href="https://bytenota.com/default.xsl?sitemap=root"?>
<urlset
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:image="https://www.google.com/schemas/sitemap-image/1.1"
>
  <url>
    <loc><![CDATA[https://bytenota.com/]]></loc>
    <lastmod><![CDATA[2022-04-11T12:02:14+00:00]]></lastmod>
    <changefreq><![CDATA[always]]></changefreq>
    <priority><![CDATA[1]]></priority>
  </url>

What is CDATA?

Do XML sitemaps normally come with CDATA like this, or is this a coding error on the site's part?

https://bytenota.com/sitemap.xml
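
One possible explanation for the blank page, not confirmed anywhere in this thread: if `$link->innertext` hands back the `<loc>` content with its `<![CDATA[...]]>` wrapper still attached, echoing it into an HTML page would show nothing, because browsers treat `<![CDATA[...]]>` in ordinary HTML as a comment. A minimal sketch of stripping the wrapper and escaping the result follows. `unwrap_cdata` is a hypothetical helper, not part of simplehtmldom, and `$raw` is an assumed value standing in for what `innertext` might return.

```php
<?php
// Hypothetical helper (not part of simplehtmldom): strips a CDATA wrapper if one is present.
function unwrap_cdata(string $text): string
{
    // Match "<![CDATA[" ... "]]>" around the whole value, allowing surrounding whitespace.
    if (preg_match('/^\s*<!\[CDATA\[(.*?)\]\]>\s*$/s', $text, $m)) {
        return $m[1];
    }
    // No wrapper found: return the text with surrounding whitespace removed.
    return trim($text);
}

// Assumed value, as $link->innertext might return it for the sitemap quoted above.
$raw = '<![CDATA[https://bytenota.com/]]>';

// Escape before echoing so the browser displays the text instead of parsing it as markup.
echo htmlspecialchars(unwrap_cdata($raw)) . "<br>\n";
```

Viewing the page source (rather than the rendered page) of the original script's output would be a quick way to check whether this is what is happening.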

@Sempervivum, Nov 07, 2022:

> What is the CDATA ?

I encountered CDATA in some old JavaScript code and, as it's outdated in that context, I didn't look into it.

However, it's well explained on Wikipedia:

https://en.wikipedia.org/wiki/CDATA
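
In short, a CDATA section is valid XML: it marks its content as literal text, so a sitemap that wraps its values this way is not malformed. As a rough sketch, assuming PHP's built-in DOM extension is available, an XML parser returns the text inside the section without the wrapper. `extract_sitemap_locs` is a hypothetical helper name, and the inline `$sample` is a shortened copy of the sitemap quoted earlier, used here so the example runs without fetching anything.

```php
<?php
// Hypothetical helper (not from the original post): returns the text of every <loc> element.
function extract_sitemap_locs(string $xml): array
{
    $doc = new DOMDocument();
    // Suppress libxml warnings on malformed input; loadXML() returns false on failure.
    if (!@$doc->loadXML($xml)) {
        return [];
    }
    $links = [];
    foreach ($doc->getElementsByTagName('loc') as $loc) {
        // textContent yields the content of CDATA sections without the wrapper.
        $links[] = trim($loc->textContent);
    }
    return $links;
}

$sample = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc><![CDATA[https://bytenota.com/]]></loc>
    <lastmod><![CDATA[2022-04-11T12:02:14+00:00]]></lastmod>
  </url>
</urlset>
XML;

foreach (extract_sitemap_locs($sample) as $link) {
    // Escape for HTML output, as the original script echoes into a web page.
    echo htmlspecialchars($link) . "<br>\n";
}
```

For a live sitemap, the XML string would first have to be downloaded (for example with `file_get_contents`), which can fail for reasons unrelated to CDATA, so checking that step separately is worthwhile.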

@novice2022 (author), Dec 05, 2022: Cheers!

I just visited the forum after one month or more.